Non-aqueous phase liquids (NAPLs) present as a separate phase in the subsurface
environment pose a long-term threat to the quality of groundwater due to their low
solubility in water. Characterization of spatially distributed NAPL residual saturation,
as  well  as  aquifer  hydrogeologic  parameters,  is  of  paramount  significance  to  design 
site-specific  efficient  remediation  technologies.  The  objective  of  this  research  is  to  de- 
velop an  optimal  estimation  algorithm  for  predicting  three-dimensional  distributions 
of  NAPL  residual  saturation  and  Darcy  flux  from  concentration  measurements  of  par- 
titioning and  nonpartitioning  tracers.  This  dissertation  applies  stochastic  methods  to 
the  analysis  and  prediction  of  transport  of  a  partitioning  tracer  in  a  three-dimensional, 
heterogeneous,  NAPL-contaminated  aquifer.  Partial  differential  equations  for  ensem- 
ble moments  (mean,  auto-covariances,  and  cross-covariances)  are  derived  by  applying 
perturbation techniques to the advection-dispersion-retardation equation. The
concentration uncertainty is assumed to be caused by the small-scale fluctuations in a
steady-state  Darcy  flux  field  and  NAPL  residual  saturation  field.  These  equations 
are  solved  numerically  using  a  finite-difference  procedure. 

The unconditional simulation results reveal important information on the
statistical correlation between the concentration observations and the aquifer
hydrogeochemical parameters. The quality of NAPL estimation from concentration
observations depends on the conductivity-NAPL correlation, with a perfectly negative
conductivity-NAPL correlation generating the strongest correlation between
concentration measurements and the estimated NAPL and Darcy flux. It is found that
tracer  concentration  and  NAPL  saturation  are  correlated  over  larger  distances  in  the 
direction of mean flow than transverse to the direction of mean flow. Furthermore,
the magnitude of this correlation is approximately constant over the duration of the
tracer experiment, implying that all concentration measurements taken at a particular
location  contain  equal  information  about  NAPL  distribution  within  the  experimental 
domain. 

A  distributed  parameter,  extended  Kalman  filter  is  developed  to  estimate  spa- 
tially distributed  NAPL  residual  saturation  and  Darcy  flux,  and  to  predict  site-specific 
movement  of  a  reactive  solute  plume  in  saturated  subsurface  environments.  Through 
conditioning  on  concentration  measurements,  uncertainties  associated  with  model 
predictions  can  be  reduced.  Because  the  optimal  estimation  algorithm  is  computa- 
tionally intensive  for  the  three-dimensional  problem,  a  simplified  sequential  condition- 
ing algorithm  is  developed  to  condition  the  Darcy  flux  field  only  on  nonpartitioning 
tracer  measurements  and  to  condition  the  NAPL  field  only  on  partitioning  tracer 
measurements.  The  performance  of  the  algorithm  is  illustrated  for  five  synthetic 
examples. 

Data from field tracer tests conducted at the OU-1 test cell at Hill AFB, Utah,
are analyzed to evaluate the applicability of the optimal estimation algorithm
developed in this dissertation for field-scale parameter estimation. The three-dimensional
distributions  of  NAPL  and  Darcy  flux  predicted  by  the  algorithm  compare  favorably 
with  those  estimated  from  soil  coring  data  and  temporal  moment  analysis. 

CHAPTER  1
INTRODUCTION 

1.1    Research  Background 

Ground  water  is  the  major  source  for  drinking  water  for  many  people  around 
the  world.  More  than  half  of  the  population  (52.5%)  of  the  United  States  relies  upon 
a  groundwater  source  [116].  The  quality  of  ground  water  is  of  paramount  importance 
to human health. Over the past decades in the U.S., growing environmental awareness
and legislation such as the Resource Conservation and Recovery Act (RCRA) and the
Comprehensive Environmental Response, Compensation and Liability Act (CERCLA) have
driven a sustained effort by groundwater scientists and engineers to develop
techniques for both containing and remediating soil and groundwater contamination [47].

Accumulated  evidence  suggests  that  one  of  the  primary  causes  of  ground- 
water contamination  is  historical  and  present  industrial,  agricultural,  and  commer- 
cial activities.  The  widespread  production  and  use  of  industrial  solvents  and  liquid 
petroleum  products  have  provided  ample  opportunity  for  subsurface  contamination 
from  leaking  underground  storage  tanks  and  pipelines,  hazardous  waste  sites,  and 
surface  spills  [59].  In  the  United  States,  it  is  estimated  that  more  than  300,000  sites 
may  have  contaminated  soil  and  ground  water  requiring  some  form  of  remediation 
[90].  The  technological  response  to  these  statutory  and  regulatory  demands  over  the 
past  decades  has  almost  exclusively  been  the  application  of  so-called  "pump-and- 
treat"  technology.  Simply  put,  this  technology  involves  extracting  water  from  the 
ground  below  the  water  table  using  standard  water-well  technology.  The  extracted 
and  contaminated  water  is  then  treated  with  established  above-ground  technologies 
such as air-stripping or adsorption in granular activated carbon. However, traditional
pump-and-treat methods appear to be largely ineffective for aquifers that are
contaminated by nonaqueous phase liquids (NAPLs). In such cases, more innovative
remediation methodologies are needed so that aquifer cleanup can be achieved within
a reasonable time frame and at an affordable cost.

The contamination of subsurface formations by NAPLs is physically more
complex than contamination by dissolved-phase constituents. An analysis of the
capillary  and  viscous  forces  acting  on  the  NAPL  ganglia  in  aquifers  by  Hunt  et  al.  [69] 
shows  that  the  trapped  residual  is  often  immobile  under  typical  hydraulic  gradients. 
The  immobilized  portion  that  is  retained  by  the  soil  is  referred  to  as  the  residual 
saturation  of  NAPL,  and  is  expressed  as  the  fraction  of  pore  space  occupied  by  the 
NAPL.  At  the  stage  of  residual  saturation,  the  NAPL  phase  is  discontinuous  and 
may  occur  as  single  droplets  or  ganglia  [99].  NAPL  contaminants  include  a  wide 
range  of  industrial  compounds  such  as  gasoline,  fuel  oils,  chlorinated  and  fluorinated 
hydrocarbons,  creosote,  and  transformer  oils  [88].  Petroleum  fuel  mixtures,  being  less 
dense than water, are commonly referred to as light NAPLs (LNAPLs) and float on top of
the water table. Dense NAPLs (DNAPLs), generally formed by chlorinated hydrocarbons
(CHCs) such as
trichloroethylene  (TCE),  tetrachloroethylene  (PCE),  and  other  halogenated  solvents, 
sink  rather  rapidly  through  water  until  they  encounter  a  zone  of  low  permeability 
such  as  clay  or  bedrock. 

Reasons  why  chlorinated  solvents  are  common  groundwater  contaminants  are 
that  they  exhibit  (1)  low  liquid  viscosities  (are  able  to  move  easily  into  the  subsurface); 
(2)  low  interfacial  tensions  with  water  (are  able  to  enter  into  water-wet  fractures 
relatively  easily);  (3)  high  volatilities  (are  able  as  gases  to  diffuse  rapidly  downwards 
into  the  unsaturated  zone);  (4)  low  absolute  solubilities  (are  difficult  to  remove  from 
the  saturated  zone  by  natural  advection);  (5)  high  solubilities  relative  to  drinking 
water limits (are able to cause significant health risks even when small amounts
dissolve); (6) low partitioning to soils (are not retarded by aquifer materials); and (7)
low degradabilities [94]. NAPLs are therefore a long-term source of contamination,
and their presence poses a profound threat to ground water quality as they slowly
dissolve into the aqueous phase. For traditional methods such as pump-and-treat and
cutoff-wall  enclosures,  it  may  take  decades,  or  be  simply  impossible,  to  remediate  a 
NAPL-contaminated  aquifer  [78]. 

The  need  for  effective  subsurface  remediation  strategies  at  NAPL  contami- 
nated sites  has  resulted  in  accelerated  research  and  experiments  for  developing  new 
in-situ technologies. Experimental technologies are generally divided into two
categories: (1) mass removal and (2) mass destruction. For in-situ mass destruction,
chemicals are mixed with water that is injected into the source zone, and undergo
reactions with the NAPL such as chemical oxidation, biochemical reduction, and microbial
degradation.  Mass  removal  technologies,  which  are  a  type  of  enhanced  pump-and- 
treat,  include  circulating  steam  or  water  containing  chemical  additives  through  the 
NAPL  zone. 

Flushing  surfactants  and  cosolvents  such  as  alcohols  into  the  underground 
source-zone  is  currently  receiving  much  research  attention.  The  use  of  organic  co- 
solvents  and  surfactants  for  in-situ  aquifer  remediation  is  based  on  four  principles 
[99,  96]:  (1)  cosolvents  and  surfactants  decrease  the  interfacial  tension  between  the 
aqueous  phase  and  nonaqueous  phase  liquids  (NAPL),  inducing  mobilization  of  the 
residual  NAPL  phase;  (2)  cosolvents  and  surfactants  increase  the  apparent  solubil- 
ity of  nonpolar  organic  chemicals,  enhancing  the  release  of  organic  constituents  of 
an  immobile  NAPL  phase;  (3)  cosolvents  and  surfactants  reduce  sorption,  and  the 
resulting  decrease  in  retardation  facilitates  faster  advective  transport  of  dissolved 
contaminants;  and  (4)  mass-transfer  rates  between  solution  phase  and  sorbed  phase 
generally  increase  in  the  presence  of  cosolvents  and  surfactants.  Thus  circulation  of 
the water/chemical mixture will remove contaminant mass much more rapidly than
conventional  pump-and-treat.  To  avoid  prohibitive  cost,  the  chemical  additives  (sur- 
factants or  cosolvents)  must  usually  be  recycled  at  the  surface.  Contaminants  must 
be  separated  from  the  effluents,  and  disposed  of  or  destroyed. 

Studies  of  the  enhanced  pump-and-treat  techniques  have  been  conducted  pri- 
marily in  the  laboratory  via  column  experiments  [6,  51,  1,  45].  Although  several  field 
tests  are  currently  under  way  in  the  USA  and  Canada,  only  two  pilot-scale  field  ex- 
periments of  in-situ  cosolvent  flushing  to  remediate  contaminated  aquifers  have  been 
reported.  Broholm  and  Cherry  [12]  presented  the  results  from  the  first  pilot-scale 
field testing of cosolvent flushing for enhanced recovery of residual NAPL. The
experiments were performed on a hydraulically isolated test cell (4.5 m x 5.5 m x 2.3 m)
in  a  shallow  sandy  aquifer  at  Borden,  Ontario,  Canada.  A  small  volume  of  a  ternary 
mixture  of  NAPLs  (10%  chloroform,  40%  TCE,  and  50%  PCE)  was  injected  into  the 
aquifer about 5 m below the water table. The test cell was flushed with 30% methanol
at a pore velocity of 13 cm/day. Post-flushing soil sampling indicated a recovery of
30% of the introduced NAPL mixture.

Researchers  from  the  University  of  Florida  (UF)  conducted  another  field  study 
to  evaluate  the  in-situ  cosolvent  flushing  technology  [2,  98,  5]  in  a  test  cell  located 
at  Operable  Unit  1  (OU-1)  at  the  Hill  Air  Force  Base  (AFB)  near  Salt  Lake  City, 
Utah.  The  shallow,  sand-gravel,  surficial  aquifer  (approximately  6.1  m  thick  with  wa- 
ter table  located  at  5.8  m  below  ground  surface)  is  contaminated  with  jet  fuel  (JP-4) 
and  various  chlorinated  organic  solvents  (including  chlorobenzenes  and  chloroalkenes) 
historically disposed of at the site. A hydraulically isolated test cell (5 m x 4 m) was
built by driving 10 m long, interlocking sheet piles that penetrated about 3 m into the
clay  confining  unit.  The  cell  was  instrumented  with  12  multilevel  samplers  (MLSs) 
that  allow  simultaneous  groundwater  sample  collection  at  a  total  of  60  locations. 
A  binary  alcohol  mixture  (70%  ethanol,  10%  n-pentanol,  20%  water)  was  pumped 
through 4 injection wells into the cell over a 10-day period. The dissolved
contaminant constituents in the fluids collected in three extraction wells and the 12 MLSs
were  monitored  during  the  tests.  Tracer  tests  and  soil  core  sampling  were  performed 
before  and  after  the  cosolvent  flushing  to  quantify  the  amount  of  LNAPL  that  had 
been removed by the flushing experiment. Preliminary analysis of the data
indicates that about 70% to 80% of the LNAPL was effectively removed [5]. The MLS
tracer breakthrough data from this field study are used later in this dissertation
research.

The  use  of  surfactants  and  cosolvents  for  the  remediation  of  NAPL  source 
zones is associated with many technical difficulties. The most apparent difficulty is
caused  by  heterogeneities  in  the  hydraulic  conductivity  and  the  NAPL  distributions. 
Detailed  monitoring,  though  often  infeasible,  can  provide  more  specific  delineation 
of  the  contaminant  source  distribution  so  that  flushing  can  be  focused  on  these  lo- 
cations. For  the  surfactant  and  cosolvent  flushing  technologies  to  remove  the  NAPL 
efficiently,  the  flushing  fluid  must  come  into  direct  contact  with  the  NAPL  to  cause 
mobilization  and  solubilization  and  then  to  carry  the  dissolved  contaminant  mass  to 
the  extraction  wells.  On  the  other  hand,  NAPL  accumulations  lying  in  depressions  at 
the  bottom  of  an  aquifer,  or  in  low-permeability  lenses  within  an  aquifer,  are  partic- 
ularly difficult  to  remove  because  the  flushing  fluid  tends  to  bypass  low-permeability 
regions  and  flow  primarily  through  high-permeability  regions.  Therefore,  an  adequate 
characterization  of  the  contaminated  aquifer  becomes  extremely  important  if  efficient 
remediation  is  desired  [93].  With  poor  site  characterization,  remediation  schemes  may 
be  unnecessarily  expensive,  because  costly  design  may  be  required  to  compensate  for 
uncertainty  [130,  131]. 

The  goal  of  characterization  of  sites  with  groundwater  contamination  is  to 
determine  the  extent  of  the  contamination  and  to  select  and  design  a  cost-effective 
remediation strategy. The data required to formulate a conceptual model that
describes the mechanisms of contaminant movement include [90]:

•  information  to  define  or  estimate  the  horizontal  and  vertical  extent  of  ground 
water  contamination, 

•  information  to  estimate  the  location  of  contaminant  source  areas, 

•  information  to  describe  the  hydrogeologic  setting,  and 

•  information  to  estimate  the  site's  restoration  potential. 

There  are  four  principal  methods  of  NAPL  zone  characterization  available  for 
site  investigation:  (1)  core  sampling,  (2)  cone  penetrometer  testing,  (3)  geophysical 
logging,  and  (4)  tracer  test  methods.  The  particular  advantage  of  tracer  test  methods 
over  the  other  methods  is  that  there  is  compelling  evidence  that  the  sample  volume 
required  for  assessing  NAPL  saturation  is  much  larger  than  that  available  from  either 
core samples or geophysical logs [83]. Tracer tests sample a much larger volume of
aquifer  and  should  yield  more  reliable  estimates  of  average  aquifer  hydrogeochemical 
parameters,  such  as  NAPL  residual  saturation,  porosity,  hydraulic  conductivity,  and 
Darcy  flux. 

The purposes of a tracer test often include (1) site hydraulic characterization
to determine parameters such as hydraulic conductivity and dispersivity, using
nonpartitioning tracers, and (2) location of NAPL pools and lenses and estimation of
NAPL residual saturation, using partitioning tracers. Tracers are chemicals that can be
added  to  fluids  in  small  concentrations  and  used  to  follow  fluid  movement  without 
affecting  their  physical  properties.  A  partitioning  interwell  tracer  test  consists  of 
simultaneous  injection  of  several  tracers  with  different  partitioning  coefficients  at  one 
or  more  injection  wells  and  the  subsequent  measurement  of  tracer  concentrations  at 
one  or  more  extraction  (or  monitoring)  wells.  When  tracers  with  different  partitioning 
coefficients are injected into the aquifer, nonpartitioning tracers (such as bromide and
tritium) remain in the water phase while partitioning tracers move back and forth
between the water and NAPL phases. Partitioning and nonpartitioning tracer molecules
in the water travel at the flow velocity; however, molecules of partitioning tracers
do not move while they reside in the NAPL phase. The extent of the separation of
partitioning  and  nonpartitioning  tracer  pulses  depends  upon  the  fraction  of  time 
tracers  spend  in  the  NAPL  phase  compared  with  that  in  the  water  phase,  which  is  a 
function  of  NAPL  residual  saturation  and  the  partitioning  coefficient.  The  greater  the 
chromatographic  separation  of  the  tracers,  the  greater  the  NAPL  residual  saturation. 
Hence  it  is  possible  to  estimate  the  average  mass  of  residual  NAPL  over  the  aquifer 
volume  swept  by  the  tracers. 
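
The chromatographic argument above can be illustrated with the commonly used equilibrium partitioning relation R = 1 + K_N S_N / (1 - S_N), where R is the ratio of the mean travel time of the partitioning tracer to that of the nonpartitioning tracer, K_N is the NAPL-water partitioning coefficient, and S_N is the NAPL residual saturation. The Python sketch below inverts this relation for S_N. It assumes linear equilibrium partitioning, and the function names and numerical values are hypothetical, chosen only for illustration.

```python
def retardation_factor(t_partitioning: float, t_nonpartitioning: float) -> float:
    """Ratio of mean travel times (partitioning over nonpartitioning tracer)."""
    if t_nonpartitioning <= 0.0:
        raise ValueError("nonpartitioning travel time must be positive")
    return t_partitioning / t_nonpartitioning


def napl_saturation(retardation: float, k_n: float) -> float:
    """Invert R = 1 + K_N * S_N / (1 - S_N) for the NAPL saturation S_N."""
    if retardation < 1.0 or k_n <= 0.0:
        raise ValueError("expected retardation >= 1 and K_N > 0")
    return (retardation - 1.0) / (retardation - 1.0 + k_n)


# Hypothetical values: mean arrival times of 3.0 and 2.0 days, K_N = 10.
r = retardation_factor(3.0, 2.0)
s_n = napl_saturation(r, 10.0)
print(f"R = {r:.2f}, S_N = {s_n:.4f}")
```

With these assumed numbers the retardation factor is 1.5 and the swept-volume average saturation is just under 5% of the pore space. A larger separation between the two tracer pulses (larger R) gives a larger estimated saturation for the same K_N.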

Characterization  of  spatially  distributed  NAPL  in  the  residual  saturation  phase 
is  a  more  complex  problem  and  often  a  prerequisite  to  remediation.  The  presence 
of  natural  heterogeneities  of  aquifer  parameters  such  as  hydraulic  conductivity  and 
porosity  in  the  field  makes  this  problem  even  more  difficult  to  solve.  The  travel  time 
of  a  partitioning  tracer  particle  to  an  observation  point  (or  in  the  Eulerian  frame 
work,  the  tracer  concentration  at  a  particular  observation  point)  in  the  subsurface 
is  governed  by  the  combined  factors  of  NAPL  content,  properties  of  the  porous  me- 
dia, and  flow  conditions.  Hence  the  tracer  breakthrough  data  at  that  observation 
point  contains  important  information  concerning  local  properties  of  both  the  media 
and  NAPL.  In  a  three-dimensional  sampling  network,  repeated  sampling  of  tracer 
concentration  is  possible  by  installing  multilevel  samplers.  The  question  is  how  to 
best  interpret  these  data  to  achieve  an  optimal  estimate  of  the  spatial  distributions 
of  aquifer  hydrogeologic  parameters  and  NAPL  residual  saturation  at  a  particular 
NAPL  contaminated  site. 

1.2    Stochastic  Subsurface  Hydrology,  Inverse  Problems
1.2.1    Flow  and  transport  processes 

Fluids are defined as materials that continue to deform in the presence of any
shearing stress; the term "flow" refers to this continuous deformation
of a fluid. Because a complete description of fluid motion relies on the knowledge
of boundary conditions, mathematical models describing the transport of fluids are
essentially scale-dependent. With the assumption of continua, the dynamic and
kinematic relationships between fluid, medium, and flow parameters at a point within a
considered flow domain are often expressed by partial differential equations (PDEs).

The  smallest  continuum  scale  applied  in  fluid  mechanics  is  on  the  order  of  a 
micron.  Essential  to  the  treatment  of  fluids  as  continua  is  the  concept  of  a  particle 
which  is  an  ensemble  of  many  molecules  contained  in  a  small  volume.  The  size  of 
the  fluid  particle  should  be  much  larger  than  the  mean  free  path  of  a  single  molecule, 
yet  it  should  be  sufficiently  small  so  that  meaningful  bulk  fluid  properties  can  be 
quantitatively  determined  for  it.  The  description  of  the  fluid  continuum  is  thus  an 
integrated  ensemble  of  all  the  averaged  molecular  processes  such  as  molecular  diffusion 
and  fluid  deformation  characterized  by  the  parameter  of  viscosity. 

The continuum for flow in a porous medium can be defined if the concept of a
representative elementary volume (REV) is introduced [7]. The size of the REV should
be much smaller than the entire flow domain but larger than a single pore, so that
averaging to produce the porous medium continuum is legitimate. The pore scale is
typically assumed to be on the order of 10^-5 m to 10^-1 m.

The  development  of  scientific,  quantitative  theory  of  flow  through  porous  me- 
dia has  spanned  almost  140  years  since  the  discovery  of  Darcy's  law  in  1856.  Con- 
siderable progress  in  the  study  of  the  transport  of  solutes  in  groundwater  has  been 
made  as  a  result  of  growing  concern  about  water  quality  and  pollution.  Darcy's  law 
is classified as a phenomenological relationship that is empirically based on experiments of
steady flow in a vertical column of homogeneous soil. Averaging the Navier-Stokes
equation with respect to pore boundary conditions for a fluid leads to a flux equation
with the same form as Darcy's equation. This is done by assuming that (1) flow
velocity and its local derivative are small (the Reynolds number based on average
grain diameter does not exceed some value between 1 and 10 [7]); and (2) the
convective acceleration term when integrated over a macroscopic volume is zero for uniform
rectilinear macroscopic flow if the medium is homogeneous [25].
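
As a rough check of the first assumption, the Python sketch below evaluates the one-dimensional Darcy flux q = -K dh/dx and the Reynolds number based on grain diameter, Re = rho |q| d / mu. The conductivity, gradient, grain diameter, and water properties are hypothetical values chosen for illustration, not data from this study.

```python
def darcy_flux(conductivity: float, head_gradient: float) -> float:
    """One-dimensional Darcy flux q = -K * dh/dx, in m/s."""
    return -conductivity * head_gradient


def grain_reynolds_number(flux: float, grain_diameter: float,
                          density: float = 1000.0,
                          viscosity: float = 1.0e-3) -> float:
    """Re = rho * |q| * d / mu, based on the average grain diameter."""
    return density * abs(flux) * grain_diameter / viscosity


# Assumed sandy aquifer: K = 1e-4 m/s, gradient = -0.01, grain size = 0.5 mm.
q = darcy_flux(conductivity=1.0e-4, head_gradient=-0.01)
re = grain_reynolds_number(q, grain_diameter=5.0e-4)
print(f"q = {q:.2e} m/s, Re = {re:.2e}, below the 1-10 range: {re < 1.0}")
```

For these assumed values the Reynolds number is several orders of magnitude below the threshold range quoted above, which is typical of natural-gradient groundwater flow.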

The  mass  balance  equation  for  water  in  conjunction  with  Darcy's  equation  for 
flux,  and  the  transport  equation  for  aqueous  phase  solutes  form  the  basis  for  the  quan- 
titative description  of  subsurface  flow  and  solute  transport.  Dissolved  material  trans- 
ported by  groundwater  in  the  subsurface  environment  undergoes  processes  of  advec- 
tion,  hydrodynamic  dispersion,  sorption/desorption,  and  biochemical  reactions,  which 
are  described  by  the  traditional  advection-dispersion-retardation-reaction  equation. 
These  equations  have  been  applied  extensively,  using  analytical  solutions  for  simple 
configurations  and  various  numerical  solution  techniques  for  more  complicated  situa- 
tions. Determination  of  the  parameters  involved  in  these  equations  is  often  attempted 
using  controlled  laboratory  experiments  on  samples  of  soil  or  aquifer  material.  For 
example, the values of permeability or dispersivity can be inferred from laboratory
column experiments subject to a macroscopic one-dimensional flow. The laboratory scale,
characterizing the dimension of common experimental setups, is therefore of the order
of 10^-1 m to 10^0 m [31].

Flow  and  transport  in  the  field  occur  at  much  larger  scales  over  which  signif- 
icant spatial  variability  in  the  hydraulic  parameters  may  occur.  This  heterogeneity 
is  believed  to  be  responsible  for  the  large-scale  spreading  of  solutes  transported  in 
aquifers.  The  traditional  approach  for  dealing  with  flow  and  transport  is  to  apply  the 
lab-based  models,  and  to  assume  that  the  formation  is  spatially  homogeneous.  How- 
ever, the  predictive  capability  of  such  models  is  limited  largely  due  to  the  difficulties 
of determining the appropriate parameters at field scales. In field monitoring
applications, it is cost-prohibitive and virtually impossible to obtain parameters at every
point.  Field  experiments  in  the  past  two  decades  have  indicated  that  use  of  lab-based 
parameters  in  the  deterministic  models  fails  to  predict  the  evolution  of  solute  plumes 
accurately.  Findings  in  some  early  controlled  field  experiments  [79,  121,  50,  77,  53] 
have  demonstrated  that 

•  The  solute  plume  displays  an  irregular  spatial  pattern,  unlike  the  one  predicted 
by  the  conventional  advective-dispersive  equation  (ADE)  incorporating  constant 
coefficients; 

•  The  spread  of  the  solute  body  around  its  center,  determined  by  any  conceivable 
measure,  is  much  larger  than  the  one  associated  with  pore-scale  dispersion. 
The  apparent  effective  dispersion  coefficients  are  much  greater  by  orders  of 
magnitude  than  those  determined  with  the  aid  of  laboratory  samples,  and  may 
grow  with  the  travel  time  of  the  solute  body  [32]. 

1.2.2    Stochastic  approach  to  flow  and  transport  modeling 

If the solute transport equation were solved accurately and the microscopic
variation of water flux were perfectly known, there would be no need to simulate the
macrodispersion process, which would emerge naturally from advection and
molecular diffusion [16]. However, parameters at the field scale are always spatially variable,
and  thus  are  uncertain  due  to  the  limited  number  of  error-prone  measurements  avail- 
able to  infer  them.  If  uncertain  parameters  are  used  as  model  input,  one  must  realize 
the  outputs  of  the  model  are  also  subject  to  uncertainty.  Therefore  it  is  desirable  to 
find  an  approach  to  incorporate  the  limited  number  of  measurements  with  the  exist- 
ing physically-based  models  so  that  the  influence  of  heterogeneity  and  uncertainties 
can  be  reflected  in  model  predictions.  The  development  of  stochastic  models  allows 
researchers to examine naturally deterministic phenomena in a probabilistic
framework and to effectively quantify the uncertainties associated with the predictions.

To  account  for  the  variability  and  uncertainty,  formation  properties  are  re- 
garded as  spatially  random  functions.  A  survey  of  field  data  found  in  the  literature 
for  properties  at  the  local  scale  has  been  presented  by  Freeze  [48].  It  was  reported 
that hydraulic conductivity K often exhibits a log-normal distribution with the
variance of ln K ranging from 0.4 to 4. Freeze [48] also indicated that the porosity is much
less  variable  than  the  hydraulic  conductivity.  Delhomme  [42]  summarized  data  about 
the  regional-scale  heterogeneity  for  a  few  aquifers  in  France.  Again,  the  log-normal 
distribution  was  found  to  adequately  describe  the  transmissivity  distributions  with 
variance values between 0.7 and 5. Delhomme [42] also mentioned that correlation scales
(defined as the length over which the random variable is statistically correlated) in
the horizontal plane are in the range of 1 km to 20 km. Clifton and Neuman [22] analyzed
data obtained from 148 wells in the Avra Valley where the aquifer thickness is about
200  m,  and  horizontal  extent  is  of  the  order  of  tens  of  kilometers.  The  variance  of 
the  log-transmissivity  was  found  to  be  0.5,  whereas  the  correlation  scale  was  of  the 
order  of  8  km. 
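
To give a sense of what these log-variances imply, the Python sketch below uses the standard log-normal identities to convert a variance of ln K into the coefficient of variation of K and the ratio of arithmetic to geometric mean. The function name and the zero mean of ln K are assumptions made for illustration.

```python
import math


def lognormal_stats(mean_ln_k: float, var_ln_k: float):
    """Return (geometric mean, arithmetic mean, coefficient of variation) of K."""
    geometric_mean = math.exp(mean_ln_k)
    arithmetic_mean = math.exp(mean_ln_k + 0.5 * var_ln_k)
    coeff_variation = math.sqrt(math.exp(var_ln_k) - 1.0)
    return geometric_mean, arithmetic_mean, coeff_variation


# End members of the range of ln K variance reported above.
for var in (0.4, 4.0):
    kg, ka, cv = lognormal_stats(0.0, var)
    print(f"var(ln K) = {var}: Ka/Kg = {ka / kg:.2f}, CV = {cv:.2f}")
```

A variance of 0.4 corresponds to a coefficient of variation of roughly 0.7, while a variance of 4 corresponds to a coefficient of variation above 7, so the upper end of the reported range represents a very strongly heterogeneous formation.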

Theoretical research treating groundwater flow and solute transport in a
probabilistic framework has been under rapid development since the early 1970s (see papers
by Dagan [27, 28, 35], Gelhar [55, 56, 57], and Gelhar and Axness [58]). A variety of
techniques  involving  analytical  and  numerical  methods  have  been  investigated  along 
with  several  field  studies  that  systematically  explored  the  variability  of  hydraulic 
properties.  Mathematical  description  of  stochastic  flow  and  transport  is  often  defined 
through  stochastic  partial  differential  equations  (SPDEs)  and  the  solution  includes 
ensemble  statistics  such  as  mean,  variance  and  auto-(cross-)  covariances  between  the 
considered  random  fields.  Work  in  the  early  stages  was  focused  on  the  derivation  of 
effective parameters such as effective hydraulic conductivity and effective
macrodispersivity. Using effective parameters derived from ensemble moments in this manner
requires  the  ergodic  hypothesis  which  states  that  the  ensemble  statistics  of  a  random 
process  are  equivalent  to  the  statistics  averaged  over  a  single  realization.  The  ergodic 
hypothesis  thus  provides  a  bridge  for  connecting  the  ensemble  statistics  to  a  single 
realization  which  reflects  the  particular  aquifer  properties  of  interest.  The  ergodic 
hypothesis requires the assumption that the scale of the domain of interest is much
larger than the local correlation scale of the considered random variables.
Representative  saturated  zone  stochastic  modeling  efforts  include  work  done  by  Smith 
and  Schwartz  [113,  114],  Tang  and  Pinder  [125],  Tang  and  Schwartz  [126],  Gelhar  and 
Axness  [58],  McLaughlin  and  Wood  [86,  87],  Graham  and  McLaughlin  [61,  62],  Tomp- 
son  and  Gelhar  [127],  Kapoor  and  Gelhar  [75,  76],  and  Russo  [108,  109].  Field-scale 
tracer tests were performed at Canadian Forces Base Borden [79, 50, 77].

The  moments  (for  example,  concentration  mean  and  variance)  of  the  random 
fields predicted by a stochastic model can be quantified without or with site-specific
measurements of the relevant random fields. In the first case the moments are
unconditional, while in the second case they are conditional (on the values of measurements).
It  is  largely  accepted  that  concentration  predictions  based  on  unconditional  ensemble 
moment  analysis  display  a  large  degree  of  uncertainty,  especially  in  the  near-source 
region.  Therefore  unconditional  statistics  will  likely  not  produce  the  specific  pat- 
tern reflecting  a  specific  field  condition.  The  numerical  simulation  by  Graham  and 
McLaughlin  [61]  demonstrated  that  the  unconditional  concentration  moments  could 
not  capture  the  irregular  behavior  of  the  plume  in  their  case  studies.  They  suggested 
in  a  subsequent  study  [62]  that  the  use  of  site-specific  measurements  to  condition  the 
unconditional  statistics  plays  an  essential  role  in  reducing  prediction  uncertainty. 

The process to inversely identify the parameter structure in a distributed system
from measurements of related dependent variables is called inverse modeling or
parameter estimation. Inverse algorithms seek the most probable, or maximum a posteriori,
estimate of the unknown parameter function [85]. It is desirable for an inverse
algorithm  to  be  physically  based,  while  incorporating  all  sources  of  information,  so 
that  its  outputs  reflect  the  specific  field  situation.  Groundwater  inverse  methods  have 
been  reviewed  by  McLaughlin  [84],  Yeh  [133],  Carrera  [15],  Sun  [122],  and  McLaughlin 
and  Townley  [85].  Most  groundwater  inverse  algorithms  adopt  either  a  blocked  or  a 
continuously  distributed  description  of  spatial  variability.  The  first  approach  divides 
the  domain  of  interest  into  a  number  of  geologic  blocks  [18,  19].  Each  block  is  charac- 
terized by  a  set  of  spatially  uniform  hydrogeologic  properties  which  are  sought  in  an 
appropriate  inverse  problem.  The  second  approach  views  the  properties  of  interest  as 
stationary  random  fields  which  vary  relatively  smoothly  over  space  [67,  30]. 

Representative  groundwater  inverse  algorithms  have  been  studied  by  many  re- 
searchers during  the  last  decade  and  can  be  categorized  into  two  groups,  i.e.,  linear 
methods  and  nonlinear  methods.  Linear  methods  have  the  merit  of  being  simple  to 
use  and  reasonably  robust.  Since  the  forward  equations  describing  the  groundwater 
flow and transport problems are generally nonlinear, these types of methods must
rely on certain linearizing approximations in the maximum a posteriori problem formulation.
Hoeksema and Kitanidis [67] developed a linear geostatistical algorithm
for  a  steady-state  inverse  problem  to  estimate  transmissivity  from  transmissivity  and 
steady-state head observations. For the purpose of deriving covariances, they assumed
that the log conductivity (or transmissivity) is an intrinsic random field which
can  be  adequately  approximated  by  a  discrete  block-oriented  parameterization.  Sun 
and  Yeh  [123]  extended  the  Hoeksema  and  Kitanidis  approach  to  the  dynamic  case 
using  a  discrete  adjoint  technique  to  derive  the  measurement  Jacobian  matrix.  Car- 
rera and  Medina  [17]  described  a  similar  application  of  the  adjoint  approach.  They 
found  that  the  estimation  accuracy  can  be  increased  significantly  when  multiple  head 
measurements are taken over time at each sampling location.

Nonlinear methods mimic the iterative process carried out in manual "model
calibration" and "history matching" exercises. The basic philosophy is to progressively
refine an initial or "prior" estimate until the fit between measurements and
predictions  can  no  longer  be  improved  statistically.  Although  nonlinear  methods 
do  not  involve  as  many  assumptions  as  the  linear  methods,  they  are  less  likely  to 
provide  unique  solutions  and  can  be  more  difficult  to  apply  in  practice  [85].  The  ex- 
isting nonlinear  methods  can  be  further  sub-divided  into  two  groups  depending  upon 
how  explicitly  they  rely  on  Bayesian  estimation  concepts  which  form  the  basis  of  the 
maximum a posteriori approach. Algorithms that are directly related to the Gaussian
maximum a posteriori approach have been presented by Gavalas et al. [54], and Reid
and McLaughlin [100]. This group of nonlinear methods has not been applied extensively
in hydrology but has had significant impact in the field of geostatistics.
The  second  group  of  the  nonlinear  methods,  including  the  least  squares  or  maxi- 
mum likelihood  methods,  the  pilot  point  method,  and  the  extended  Kalman  filter, 
are  alternatives  which  do  not  directly  rely  on  the  maximum  a  posteriori  approach. 
Nonlinear  least  squares  methods  and  maximum  likelihood  methods  have  been  ap- 
plied to  distributed  parameter  groundwater  inverse  problems  since  numerical  models 
became  widely  available  in  the  1960s  and  1970s  [84,  23,  24].  Clifton  and  Neuman 
[22]  used  a  block  kriging  algorithm  to  estimate  log  conductivity  values  in  the  cells 
of  a  model  computational  grid.  The  kriged  estimates  were  derived  from  scattered 
log  conductivity  measurements  and  a  set  of  prior  statistics.  The  pilot  point  method 
approximates  the  effective  log  conductivity  by  a  smooth  function  which  reproduces 
available  measurements  while  giving  acceptable  fit  to  head  data  [80].  The  advantage 
of  this  method  lies  in  its  ability  to  produce  smooth  log  effective  conductivity  fields 
which  are  physically  reasonable. 

The extended Kalman filter is an optimal recursive nonlinear estimator originating
from the ordinary linear Kalman filter [71]. Kalman filters are based on
a state equation, which describes how the parameters of interest evolve from one
measurement  time  to  another,  together  with  a  measurement  equation,  which  relates 
parameters  to  measurements.  A  Kalman  filter  processes  all  available  measurements, 
regardless  of  their  precision,  to  estimate  the  current  value  of  the  variables  of  interest, 
with  use  of  (1)  knowledge  of  the  system  and  measurement  device  dynamics,  (2)  the 
statistical  description  of  the  system  noises,  measurement  errors,  and  uncertainty  in 
the  dynamic  models,  and  (3)  any  available  information  about  the  initial  conditions 
of  the  variables  of  interest  [82].  The  recursive  nature  of  a  Kalman  filter  is  of  vital 
importance  to  the  practicality  of  filter  implementation  because  the  filter  does  not 
require  all  previous  data  to  be  kept  in  storage  and  reprocessed  every  time  a  new 
measurement is taken. From a Bayesian estimation viewpoint, the filter propagates the
conditional  moments  of  the  desired  quantities,  conditioned  on  knowledge  of  the  ac- 
tual data  coming  from  the  measuring  devices  assuming  the  states  are  jointly  normally 
distributed.  Thus  the  optimality  of  a  Kalman  filter  can  be  justified  via  the  choice  of 
the conditional mean, which is the best unbiased estimate of the quantity of interest,
and  the  conditional  variance,  which  is  the  minimum  error  variance.  If  the  states  are 
not  jointly  normally  distributed  the  Kalman  filter  produces  the  best  linear  unbiased 
estimates.  Graham  and  McLaughlin  [63]  successfully  used  the  extended  Kalman  filter 
to  predict  tracer  movement  in  Borden  tracer  tests.  Other  applications  include  works 
done  by  Graham  and  McLaughlin  [62],  Van  Geer  et  al.  [129],  Zhou  et  al.  [135],  and 
Graham  and  Tankersley  [64].  The  strength  of  the  approach  relies  on  its  ability  to 
combine  the  physically  based  groundwater  flow  and  transport  models  with  field  data 
using  Bayesian  conditioning  theory.  A  Kalman  filter  also  makes  it  easy  to  incorporate 
both  system  noise  and  measurement  error.  The  extended  Kalman  filter  is  most  useful 
in  applications  where  the  number  of  unknowns  is  relatively  small  and  measurements 
are taken at many times. Since the method uses less information on each iteration
than the maximum a posteriori algorithm, its convergence properties are pending
further investigation [85].
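
The measurement-update step described above can be illustrated with a short sketch. It is a minimal example, assuming a linear measurement equation z = h.x + v with a single scalar measurement, and it shows the ordinary linear Kalman update rather than the extended filter formulated later in this dissertation. The function name kalman_update, the two-state example, and all numerical values are hypothetical and chosen only for illustration.

```python
def kalman_update(x, P, h, z, r):
    """One scalar-measurement update of a linear Kalman filter.

    x : prior (conditional) mean of the states, list of n floats
    P : prior error covariance, n x n list of lists (symmetric)
    h : measurement row vector, so that z = h . x + noise
    z : measured value
    r : measurement error variance
    Returns the updated mean and covariance.
    """
    n = len(x)
    # P h^T: covariance between each state and the predicted measurement
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
    # Innovation variance: h P h^T + r
    s = sum(h[i] * Ph[i] for i in range(n)) + r
    # Kalman gain
    k = [Ph[i] / s for i in range(n)]
    # Innovation: measurement minus its prior prediction
    innovation = z - sum(h[i] * x[i] for i in range(n))
    x_new = [x[i] + k[i] * innovation for i in range(n)]
    # P_new = P - k (h P); h P equals (P h^T)^T because P is symmetric
    P_new = [[P[i][j] - k[i] * Ph[j] for j in range(n)] for i in range(n)]
    return x_new, P_new


# Hypothetical two-state example: state 0 is measured directly,
# state 1 is never measured but is correlated with state 0.
x_prior = [0.0, 0.0]
P_prior = [[1.0, 0.8], [0.8, 1.0]]
x_post, P_post = kalman_update(x_prior, P_prior, h=[1.0, 0.0], z=1.0, r=0.25)
print(x_post)
print(P_post)
```

In this sketch the unmeasured state is updated only through its prior cross-covariance with the measured state, and its error variance drops even though it was never observed. Because only the current mean and covariance are carried forward, earlier measurements do not need to be stored or reprocessed.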

1.3    Outline  of  the  Dissertation 

The  objective  of  this  research  is  to  develop  a  systematic  parameter  estimation 
algorithm  to  characterize  the  three-dimensional  distributions  of  NAPL  residual  satu- 
ration and  Darcy  flux  by  taking  advantage  of  field  measurements  of  both  partitioning 
and  nonpartitioning  tracer  concentrations.  The  spatially  distributed  parameters  of 
interest  in  an  inverse  problem  involving  a  NAPL-contaminated  aquifer  include  hy- 
draulic conductivity,  Darcy  flux,  piezometric  head,  and  NAPL  residual  saturation. 
All of these parameters are uncertain because they cannot be measured in detail
via either direct devices or indirect derivation. The focus of this study is
on estimating the three-dimensional distributions of Darcy flux and NAPL residual
saturation.

This  dissertation  contains  five  chapters  including  this  introductory  chapter. 
Chapter  two  focuses  on  the  development  and  solution  of  the  unconditional  moment 
propagation  equations.  Chapter  three  discusses  the  formulation  and  solution  of  the 
extended Kalman filter equations and applies the methodology to synthetic case studies.
Chapter four demonstrates the algorithm using field data from an interwell partitioning
tracer test conducted at the OU-1 test cell, Hill AFB, Utah. Final conclusions
of  the  dissertation  are  presented  in  Chapter  five. 


CHAPTER  2 
INFERENCE  OF  UNCONDITIONAL  MOMENTS 

2.1    Introduction  and  Literature  Review 

Groundwater  contamination  by  hazardous  organic  chemicals  has  caused  grow- 
ing concern  in  recent  years.  Traces  of  various  contaminants  found  in  groundwater 
have  been  reported  in  the  literature,  such  as  chlorinated  solvents,  pesticides,  mu- 
nicipal landfill  leachates,  aromatic  hydrocarbons,  and  polychlorinated  biphenyls  [79]. 
Aquifers  in  some  regions  are  contaminated  by  non-aqueous  phase  liquids  (NAPLs) 
which  are  present  as  a  separate  phase  in  the  subsurface  environment.  The  main  fea- 
tures of  NAPL  contamination  are  its  slow  movement  under  natural  hydraulic  gradient 
and  the  long-term  persistence  due  to  NAPL's  low  solubility  in  water.  Remediation 
of  hazardous  waste  sites  contaminated  with  NAPLs,  and/or  strongly-sorbed  contam- 
inants in  a  manageable  time-frame  requires  aggressive  approaches  other  than  the  tra- 
ditional pump-and-treat  [99].  Recently,  a  study  of  innovative  remediation  techniques 
through  pilot-scale  field  experiments  has  been  carried  out  by  researchers  in  the  US 
and  Canada.  Broholm  and  Cherry  [12]  reported  the  first  pilot-scale  field  testing  of 
cosolvent  flushing  for  enhanced  recovery  of  residual  NAPL.  In  their  experiment,  a 
ternary  mixture  of  NAPLs  (10%  chloroform,  40%  TCE,  and  50%  PCE)  was  injected 
into  the  aquifer  about  5  cm  below  the  water  table.  The  test  cell  was  flushed  with  30% 
methanol at a pore velocity of 13 cm/day. Post-flushing soil sampling indicated a recovery
of  30%  of  the  introduced  NAPL  mixture. 

The  second  field  study  was  performed  by  researchers  from  the  University  of 
Florida [2]. The experiment was conducted in an existing NAPL plume at Hill Air
Force Base (AFB), Utah. A hydraulically isolated test cell (4.3 m x 3.5 m) was constructed
in the surficial sand-gravel-cobble aquifer, which was contaminated with a
mixture of JP-4 jet fuel and chlorinated solvents disposed of at the site [99]. During cosolvent
flushing, a mixture of water and two alcohols, i.e., ethanol and n-pentanol, was
injected  into  the  test  cell  over  a  12-day  period.  Preliminary  analyses  indicated  a  mass 
removal  of  70%  to  90%  of  NAPL  depending  on  cell  hydrogeochemical  properties.  For 
the  purpose  of  site  characterization  and  evaluation  of  remediation  efficiency,  Annable 
et  al.  [2]  ran  a  series  of  tracer  tests  with  partitioning  and  non-partitioning  tracers 
prior  to  and  after  the  injection  of  cosolvents.  By  simultaneously  introducing  non- 
partitioning  and  partitioning  tracers  into  an  aquifer  with  entrapped  residual  NAPL 
and  analyzing  the  retardation  of  the  partitioning  tracer  using  the  tracer  breakthrough 
curves,  one  can  obtain  a  spatially-averaged  estimation  of  NAPL  residual  saturation 
for  the  region  swept  by  the  tracers  [72,  2]. 
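
The estimate described above can be sketched in a few lines. This is a minimal illustration, assuming an instantaneous tracer pulse, linear equilibrium partitioning, and immobile residual NAPL, so that the retardation factor R of the partitioning tracer relative to the nonpartitioning tracer satisfies R = 1 + K_nw S_n / (1 - S_n). The function names, the symbol k_nw for the NAPL-water partition coefficient, and the sample breakthrough data are hypothetical and are not taken from the cited studies.

```python
def mean_arrival_time(times, conc):
    """Mean arrival time of a breakthrough curve.

    Computed as the ratio of the first to the zeroth temporal moment,
    with both integrals evaluated by the trapezoidal rule.
    """
    m0 = 0.0
    m1 = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        m0 += 0.5 * (conc[i] + conc[i - 1]) * dt
        m1 += 0.5 * (times[i] * conc[i] + times[i - 1] * conc[i - 1]) * dt
    return m1 / m0


def napl_saturation(retardation, k_nw):
    """Invert R = 1 + k_nw * S / (1 - S) for the NAPL saturation S."""
    return (retardation - 1.0) / (retardation - 1.0 + k_nw)


# Hypothetical breakthrough curves sampled at the same extraction well.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
c_nonpartitioning = [0.0, 1.0, 0.0, 0.0, 0.0]
c_partitioning = [0.0, 0.0, 1.0, 0.0, 0.0]

R = mean_arrival_time(t, c_partitioning) / mean_arrival_time(t, c_nonpartitioning)
S_n = napl_saturation(R, k_nw=10.0)
print(R, S_n)
```

The result is a single saturation value for the whole swept zone. It carries no information on how NAPL is distributed within that zone.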

Four  principal  processes  influence  the  transport  behavior  of  an  organic  solute  in 
groundwater,  i.e.,  advection,  dispersion,  sorption,  and  transformation  [49,  47].  Advec- 
tion  and  dispersion  describe  the  role  of  hydrodynamics  in  governing  the  rate  of  move- 
ment and  dilution  of  the  solute.  Sorption  processes  include  adsorption,  chemisorp- 
tion,  absorption  and  ion  exchange.  The  sorption  process  by  which  a  solute,  which  was 
originally  in  solution,  becomes  distributed  between  the  solution  and  the  solid  phases 
is  called  partitioning.  Solute  partitioning  results  in  the  diminution  of  liquid-phase 
concentrations  without  changing  the  total  mass  of  the  compound,  and  also  in  the  re- 
tardation of  its  movement  relative  to  groundwater  flow.  Solute  transformation  in  the 
underground environment refers to processes such as biodegradation, radioactive decay,
and precipitation, which result in a change in the total mass of the compound, but
may  not  necessarily  slow  the  rate  of  solute  movement. 

One  of  the  apparent  difficulties  when  using  a  deterministic  model  to  describe 
solute fate and transport under field conditions is the determination of the parameters,
such as hydraulic conductivity and dispersivity. Since these models were originally
developed and tested at the laboratory scale, their accuracy has been questioned when
directly applied to the migration of contaminant plumes at the field scale. For example,
hydrodynamic dispersion [7], developed at the scale of a representative elementary
volume  (REV),  was  used  to  model  the  solute  dispersion  process  through  a  Fickian 
relationship.  The  hydrodynamic  dispersivity,  usually  obtained  by  curve  fitting  of  so- 
lute breakthrough  concentration  in  lab  column  experiments,  is  typically  of  the  order 
of millimeters to centimeters. However, interpretations of field data have consistently
suggested that dispersive spreading and dilution are not Fickian and may exhibit
three-dimensional anisotropy [79], and often the rate of spreading increases with the
solute travel distance from the source. A wide range of effective (or macroscopic,
apparent) dispersivities has been encountered in the field, and they are generally
orders of magnitude larger than the laboratory values [57].

Field-scale  transport  of  reactive  and  nonreactive  solutes  by  groundwater  is 
governed  by  the  heterogeneous  physical  and  chemical  characteristics  of  the  aquifer 
system.  The  principal  mechanism  for  enhancement  of  solute  spreading  observed  in 
the  field  tests  is  generally  believed  to  be  spatial  variability  in  the  solute  velocity  which 
is,  in  turn,  due  to  the  variability  in  hydraulic  conductivity.  In  the  case  of  a  sorbing 
solute,  the  sorption  parameter  variability  may  also  contribute  to  the  irregularity  of 
plume  movement  because  of  its  direct  impact  on  the  effective  solute  velocity  that 
is  retarded  due  to  mass  partitioning.  Uncertainty  about  the  distribution  of  aquifer 
properties  may  be  attributed  to  both  the  inability  to  obtain  detailed  measurements 
in  space  and  time  due  to  the  cost  constraints  and  the  fact  that  the  measurements 
themselves  are  error  prone. 

One method which has been proposed to deal with hydrogeochemical uncertainties
is the stochastic approach, which treats subsurface flow and transport problems
in a probabilistic framework. The stochastic methods regard aquifer properties
as well as the dependent variables (e.g., head, flux, and concentration) as spatial-temporal
random fields characterized by physically plausible joint probability distribution
functions (pdf). The population (or ensemble) of a random process thus consists
of all possible realizations that share the same statistics described by the joint
pdf,  and  each  realization  exists  uniquely  to  represent  a  specific  aquifer.  Although  a 
complete  characterization  of  a  random  process  requires  the  knowledge  of  the  full  joint 
pdf,  the  first-  and  second-order  moments  often  reveal  sufficient  information  to  make 
effective  predictions  and  management  decisions. 

Stochastic  research  has  experienced  a  rapid  development  over  the  last  two 
decades  with  the  focus  on  understanding  the  impact  of  the  natural  heterogeneities  on 
the  contaminant  fate  and  transport  (see  review  papers  by  Dagan  [29,  31,  32];  Gelhar 
[56];  and  McLaughlin  and  Townley  [85]).  The  stochastic  approach  has  become  in- 
creasingly attractive  to  hydrologists  and  engineers  because  of  its  ability  to  overcome 
difficulties  due  to  the  scarcity  of  data,  and  to  effectively  quantify  both  the  variability 
and  prediction  uncertainty.  A  variety  of  techniques  involving  theoretical  and  numeri- 
cal methods  have  been  investigated  along  with  several  field  studies  that  systematically 
explored  the  real-world  variability  of  hydraulic  properties.  Theoretical  development 
has focused on effects of spatial variability of hydraulic conductivity on field-scale advection
and dispersive mixing (e.g., in papers by Freeze [48], Dagan [27, 28, 30, 33, 35],
Gelhar  [55,  56,  57],  Gelhar  and  Axness  [58],  Neuman  et  al.  [92],  Sposito  et  al.  [117], 
Dagan,  Cvetkovic  and  Shapiro  [38],  Kapoor  and  Gelhar  [75,  76],  and  Dagan,  Bellin 
and  Rubin  [37]).  Similar  theories  have  also  been  developed  to  evaluate  the  effects  of 
spatial  variability  in  the  sorptive  capacity  of  aquifer  materials  [34,  128,  26,  39,  14]. 
Representative numerical stochastic models include work done by Tang and Pinder
[125];  Smith  and  Schwartz  [113,  114];  Tang,  Schwartz,  and  Smith  [126];  Graham  and 
McLaughlin [61, 62]; Tompson and Gelhar [127]; Rubin [101]; Bellin et al. [11]; Chin
and Wang [21]; Dykaar and Kitanidis [44]; Russo [108, 109]; and Russo, Zaidel, and
Laufer  [110]. 

Much  of  the  work  was  focused  on  the  explanation  of  the  discrepancy  between 
field-scale  and  laboratory-scale  observations  of  solute  transport  behavior  and  can 
be  categorized  into  three  classes:  (1)  effective  parameter  estimation,  e.g.,  macro- 
dispersivity  and  effective  hydraulic  conductivity,  (2)  flow  and  solute  transport  pre- 
dictions, and  (3)  quantification  of  uncertainties  associated  with  these  predictions.  A 
common  result  indicates  that  the  macroscopic  dispersion  coefficients  are  no  longer 
constant  in  space  and  time,  and  often  several  orders  of  magnitude  larger  than  those 
determined  from  lab  experiments.  A  field-scale  test  was  performed  at  Canadian 
Forces Base Borden [79, 50, 121] in which migration of both reactive and nonreactive
tracers was carefully monitored in a sandy aquifer under a natural hydraulic gradient.
This study indicated that the bulk retardation factors describing the sorption
of  reactive  solutes  may  increase  with  time.  LeBlanc  et  al.  [77]  and  Garabedian  et  al. 
[53]  reported  a  similar  field  test  at  Cape  Cod.  Both  field  experiments  have  revealed 
that  the  field-scale  dispersivity  tensor  is  anisotropic  with  the  longitudinal  dispersivity 
being  significantly  larger  than  the  laboratory  values  observed  with  the  same  aquifer 
material.  Garabedian  [52]  found  that  the  longitudinal  dispersion  of  a  reactive  solute 
is larger than that of a nonreactive solute after the same displacement distance.

Methodologies  used  to  obtain  unconditional  statistics  are  broad  and  the  most 
common  one  is  the  direct  simulation  approach  represented  by  the  Monte  Carlo  sim- 
ulation. This  approach  has  the  advantage  of  generality,  but  can  be  computationally 
burdensome  [86].  It  is  usually  necessary  to  analyze  a  very  large  number  of  replicates 
in  order  for  the  ensemble  statistics  to  converge.  Another  approach  involves  the  use  of 
the  local  flow  and  transport  equations  as  the  starting  point  and  the  introduction  of 
appropriate approximations such as local linearization and perturbation expansions.
It produces closed-form, analytical moment expressions for simply configured problems,
or moment equations for more complex, multi-dimensional problems which must
be solved numerically. The results are relatively less general and their accuracy relies
on the condition that the input variances must be "sufficiently small." Nevertheless,
this approach is much less computationally demanding, and has been widely applied to
study groundwater flow and transport in heterogeneous porous media.
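
The cost of the direct simulation approach can be seen in a small sketch. It is a toy problem chosen only for illustration and is not taken from the cited studies: a one-dimensional column of cells in series, with independent log-normal conductivity in each cell, for which the effective conductivity of one realization is the harmonic mean of the cell values. The function names, the random seed, and the parameter values are assumptions.

```python
import math
import random


def effective_k_series(n_cells, sigma2, rng):
    """Effective conductivity of one realization of a 1-D column in series.

    ln K in each cell is drawn independently from N(0, sigma2). For flow
    through cells in series the effective value is the harmonic mean.
    """
    sigma = math.sqrt(sigma2)
    inv_sum = 0.0
    for _ in range(n_cells):
        inv_sum += math.exp(-rng.gauss(0.0, sigma))  # 1/K for this cell
    return n_cells / inv_sum


def monte_carlo_moments(n_real, n_cells, sigma2, seed=0):
    """Ensemble mean and variance estimated from n_real replicates."""
    rng = random.Random(seed)
    samples = [effective_k_series(n_cells, sigma2, rng) for _ in range(n_real)]
    mean = sum(samples) / n_real
    var = sum((s - mean) ** 2 for s in samples) / (n_real - 1)
    return mean, var


mean, var = monte_carlo_moments(n_real=500, n_cells=200, sigma2=0.5)
print(mean, var)
# For a long column the ensemble mean should lie close to
# exp(-sigma2 / 2), the large-domain limit of the harmonic mean.
print(math.exp(-0.25))
```

Each replicate requires a full forward solution, which is trivial here but expensive for a three-dimensional transport model. Shortening the column relative to the correlation scale (here, using fewer cells) increases the scatter between realizations, which is the situation where a single realization departs from the ensemble statistics.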

Gelhar  and  Axness  [58]  related  macroscopic  spreading  of  solute  plumes  to 
spatial  variability  of  hydraulic  conductivity.  By  using  the  spectral  perturbation  tech- 
nique, they  derived  a  macroscopic  dispersive  flux  from  the  perturbed  steady  flow 
and  solute  transport  equations.  The  macrodispersivity  tensor  was  evaluated  in  terms 
of  a  three-dimensional  statistically  anisotropic  covariance  describing  the  hydraulic 
conductivity  field.  They  found  that  macrodispersion  coefficients  depend  upon  the 
statistics  of  the  hydraulic  conductivity  field  and  other  flow  and  transport  parameters. 
The study was based on an assumption that the concentration field is locally stationary,
which requires the mean concentration field to be relatively smooth. Therefore,
results from Gelhar and Axness [58] are valid only after a substantial displacement
distance from the source, beyond the near-source region where large concentration
gradients and sharp curvature of the mean plume occur.

Dagan  [28,  29],  using  a  Lagrangian  analysis  of  single-particle  displacement, 
derived  in  a  closed  form  the  time-dependent  components  of  the  dispersion  tensor  for 
isotropic, two- and three-dimensional structures, characterized by an exponential log-conductivity
covariance. Macroscopic solute spreading was found to be anisotropic
and non-Fickian except after long travel distances. Dagan's analysis [28, 29] of
macrodispersion was later compared to that of Gelhar and Axness [58] by Sudicky
[121], who evaluated field-scale macrodispersivities in two dimensions from spatial
moment analysis of the Borden concentration data. He found general agreement between
macrodispersivities predicted by the stochastic theories and those estimated
from the field data. A similar analysis was also conducted by Rajaram and Gelhar
[97], whose three-dimensional spatial moment analysis of the Borden tracer test indicated
that the longitudinal macrodispersivity at large time compared well with the
predictions  of  the  stochastic  theory  of  Gelhar  and  Axness  [58].  Although  use  of  the 
Lagrangian  approach  has  been  successful  in  many  flow  and  transport  applications 
such  as  derivation  of  the  displacement  statistics  of  ergodic  and  nonergodic  plumes 
[34,  35,  36]  and  formulation  of  the  transport  problem  using  travel  time  statistics 
[40,  38,  106],  it  is  limited  by  the  difficulty  of  deriving  closed-form  solutions  for  the 
joint  displacements  of  two  or  more  particles. 

Graham  and  McLaughlin  [61]  used  first-order  Eulerian  methods  and  derived  a 
system  of  ensemble  concentration  moment  equations  from  the  local  transport  equa- 
tions for  a  conservative  solute  in  a  two-dimensional  steady-state  flow  system.  They 
assumed that the velocity fluctuations are caused by small-scale variations in log hydraulic
conductivity, which is characterized by a negative exponential covariance
function. A line source with a certain initial concentration at a deterministic location
was  introduced  at  time  zero,  and  the  coupled  stochastic  partial  differential  equations 
were  then  solved  through  a  Galerkin  finite  element  algorithm.  The  advantage  of  their 
method  is  that  both  the  time-varying  concentration  mean  and  standard  deviation 
can  be  quantified.  Analysis  of  their  synthetic  examples  showed  that  the  resulting 
concentration  field  is  nonstationary  even  though  the  hydraulic  conductivity  field  is 
stationary.  Graham  and  McLaughlin  [61]  found  that  the  unconditional  concentration 
predictions  displayed  high  uncertainty  especially  in  the  near-source  region.  The  en- 
semble longitudinal  and  transverse  macrodispersive  fluxes  were  found  to  increase  with 
increasing  hydraulic  conductivity  variance  and  correlation  scale,  and  the  ensemble 
mean  plume  was  found  to  approach  Fickian  behavior  as  time  progressed.  Compari- 
son of  their  numerical  results  for  macrodispersivity  to  those  predicted  by  Gelhar  and 
Axness [58] and Dagan [29] indicated consistent agreement. Limitations of Graham
and McLaughlin's method come from the small perturbation assumption, which may
not  be  valid  in  geological  formations  where  variances  of  the  log-conductivity  exceed 
one,  and  the  computational  demands  of  the  numerical  approach. 

Tompson  and  Gelhar  [127]  developed  a  numerical  procedure  based  on  a  random 
walk  particle  method  (RWPM)  to  study  detailed  movement  of  a  single  nonreactive 
solute  within  a  large,  three-dimensional,  heterogeneous  porous  flow  system  generated 
by  statistically  isotropic  hydraulic  conductivity  fields.  The  flow  field  was  established 
by  a  constant  head  drop  between  two  opposing  faces,  while  no-flow  conditions  were 
maintained  on  the  other  boundaries.  A  small  pulse  of  solute  was  released  near  the 
upstream boundary in a cubic domain with about 10^5 nodes or cells. Nonergodic
effects  were  found  to  prevail  in  the  early  stage  of  the  plume  development.  For  a 
less variable conductivity field (e.g., log-conductivity variance σ² = 1), their estimates of the asymptotic longitudinal
macroscopic dispersivity coefficients from the numerical simulations were
consistent with the predictions by Gelhar and Axness [58]. For more variable fields
(e.g., σ² = 1.7 or 2.3), however, the numerical macrodispersivities were larger than
the  analytical  predictions.  The  authors  suggested  that  this  could  be  due  to  incomplete 
plume  development  in  a  domain  that  was  not  sufficiently  large. 

A  recent  analytic  study  by  Kapoor  and  Gelhar  [75,  76]  focused  on  the  dy- 
namics of  concentration  fluctuations.  Assuming  a  smooth  mean  concentration  field 
they  used  a  Green's  function  technique  to  derive  an  analytical  solution  for  the  con- 
centration variance  conservation  equation  obtained  using  the  perturbation  method 
for  both  isotropic  and  anisotropic,  three-dimensional  heterogeneous  porous  media. 
Kapoor  and  Gelhar  [75,  76]  showed  that  the  products  of  the  macrodispersion  coef- 
ficient and  the  squared  gradient  of  mean  concentration  field  determine  the  rate  of 
production  of  concentration  variance;  while  the  rate  of  dissipation  of  concentration 
variance  is  determined  by  the  product  of  the  local  dispersion  coefficient  and  the  mean 
squared gradient of the concentration perturbation field. Their results also suggest
that local dispersion is the only mechanism to diminish the concentration variance
created  by  the  gradients  in  the  mean  concentration  fields.  Kapoor  and  Gelhar  [75] 
approximated  the  variance  dissipation  as  a  first-order  decay  with  the  decay  coefficient 
proportional  to  the  local  dispersion  coefficient  divided  by  a  concentration  microscale 
(a critical length that characterizes the derivatives of the concentration perturbation
field).  Their  analytical  solution  to  the  coupled  variance  equations  revealed  that  away 
from  the  regions  of  large  gradients  in  the  variance  field,  at  large  times,  a  linear  rela- 
tionship between  concentration  variance  and  the  square  of  the  partial  derivatives  of 
the  mean  concentration  field  may  be  established.  The  authors  concluded  that  the  hy- 
pothetical zero  local  dispersion  cases  used  in  many  stochastic  models  would  lead  the 
concentration  coefficient  of  variation  to  increase  unboundedly  with  time.  Kapoor  and 
Gelhar [75] related their theoretical results to the observations of the Cape Cod tracer
test and pointed out that it was important to include the dissipation action of
local  dispersion  in  any  realistic  assessment  of  concentration  fluctuations. 

Zhang  and  Neuman  [134]  proposed  a  combined  analytical-numerical  approach 
to  the  Eulerian-Lagrangian  transport  theory  [91]  for  a  nonreactive  solute  transport 
problem  under  a  special  case  of  steady-state  flow  in  a  mildly  fluctuating,  statistically 
homogeneous,  log-normal  hydraulic  conductivity  field.  At  early  dimensionless  travel 
distance  (i.e.,  less  than  half  a  log  transmissivity  correlation  scale),  the  approach 
adopted  an  extended  Batchelor's  analytical  solution  [8]  to  calculate  the  concentration 
moments.  At  the  intermediate  dimensionless  travel  distance  (greater  than  half  of  the 
log  transmissivity  correlation  scale),  a  pseudo-Fickian  numerical  solution  method  was 
used.  One-  and  two-dimensional  unconditional  examples  were  studied  to  compare  the 
concentration  covariance  based  on  the  Eulerian-Lagrangian  theory  to  that  obtained 
from  Dagan's  closed-form  formula  for  variance  [34,  35].  Zhang  and  Neuman  [134] 
found that their results tended to approach the lower bound of the variance predicted by
Dagan's formula. These pseudo-Fickian lower bounds, however, compared favorably
to those obtained by Graham and McLaughlin [61, 62]. The authors indicated that
the  reliance  of  this  method  on  an  analytical  solution  at  early  time  has  a  computational 
advantage  over  Neuman's  conditional  Lagrangian  method  [91]  in  that  it  avoids  the 
need  for  Monte  Carlo  simulations  of  velocity  fields  and  particle  motions. 

Solute  transport  in  porous  formations  under  the  influence  of  spatial  variability 
of  physical  and  chemical  parameters  has  drawn  increased  attention  recently.  Sorption, 
resulting  from  an  exchange  of  solute  mass  between  the  mobile  fluid  and  immobile  re- 
gions existing  in  the  porous  medium  [13],  is  a  common  process  among  many  chemical 
reactions  between  solute  and  the  porous  material  in  the  subsurface.  In  addition  to 
physical  heterogeneities  in  hydraulic  parameters,  many  researchers  have  considered 
the  effects  of  chemical  heterogeneity  in  the  stochastic  study  of  large-scale  sorptive  so- 
lute transport  by  groundwater  (e.g.,  Dagan  [34];  Destouni  and  Cvetkovic  [43];  Gelhar 
[57];  Dagan  and  Cvetkovic  [39];  Bellin,  Rinaldo,  Bosma,  van  der  Zee,  and  Rubin  [10]; 
Burr, Sudicky, and Naff [14]; and Bellin and Rinaldo [9]).

Dagan  and  Cvetkovic  [39]  employed  a  Lagrangian  approach  and  derived  ana- 
lytical expressions  for  the  spatial  moments  of  a  reactive  solute  plume  which  undergoes 
kinetically  controlled  sorption,  according  to  a  linear  "two-site"  sorption  model.  Hy- 
draulic heterogeneity  was  represented  by  a  conductivity  field  characterized  by  an 
axisymmetric  covariance  function,  and  the  sorptive  reaction  was  assumed  to  be  de- 
pendent on  two  constant  coefficients:  a  reaction  time  and  an  equilibrium  partitioning 
coefficient.  Dagan  and  Cvetkovic  [39]  found  that  while  the  second-order  longitudinal 
spatial  moments  depended  on  two  non-interactive  effects,  i.e.,  sorption  and  conductiv- 
ity heterogeneity,  the  second-order  transverse  moments  were  not  affected  by  sorption 
kinetics. 

Bellin  et  al.  [10]  derived  a  first-order  analytical  solution  for  the  transport  of 
reactive  solutes  in  both  physically  and  chemically  heterogeneous  porous  media.  The 
solution relied on the assumption of local linear equilibrium sorption characterized by
a partitioning coefficient that is spatially variable, which implies a spatially
variable retardation factor. The retardation factor and the conductivity fields
were assumed to be either perfectly (positively or negatively) correlated or uncorrelated.
Bellin  et  al.  [10]  showed  that  the  functional  relationship  between  random  fields  had 
a  large  impact  on  the  second-order  longitudinal  spatial  moments.  Bellin  et  al.  [10] 
showed  that  the  spatial  variability  of  the  retardation  coefficient  affects  only  the  longi- 
tudinal dispersion,  while  transverse  spatial  moments  are  unaffected  by  the  additional 
chemical  heterogeneity,  leading  to  an  enhanced  degree  of  anisotropy  in  the  dispersion 
plumes.  Bellin  and  Rinaldo  [9]  later  extended  this  study  by  examining  the  impact 
of  imperfect  correlation  between  conductivity  and  the  partitioning  coefficient  fields. 
They  found  that  the  longitudinal  dimensionless  second  spatial  moment  is  smaller 
(larger)  for  an  imperfect  positive  (negative)  correlation  than  the  corresponding  value 
for  a  perfect  positive  (negative)  correlation,  and  that  the  difference  between  partial 
and  perfect  correlation  vanishes  as  the  geometric  mean  of  the  partition  coefficient 
increases. 

Burr  et  al.  [14]  performed  a  limited  Monte  Carlo  analysis  with  five  detailed  flow 
and  transport  simulations  for  both  reactive  and  nonreactive  contaminants  migrating 
in a heterogeneous, statistically anisotropic medium. The spatially variable hydraulic
conductivity was simulated with statistics similar to those of the Borden site. For the
reactive solutes, they assumed that the logarithm of hydraulic conductivity and the
logarithm of the distribution coefficient fields were negatively correlated, and that
they possessed the same spatial correlation structure. The sorption process was considered
to be either fully at equilibrium or fully rate-limited. It was found that the ensemble
mean  bulk  retardation  factor  increased  with  time  and  plume  displacement  distance. 
However,  for  individual  realizations,  the  apparent  retardation  factor  was  found  to 
either  increase  in  a  manner  similar  to  that  observed  during  the  Borden  tracer  test,  or 
to decrease. Their numerical results also demonstrated that the first-order stochastic
analyses of Gelhar and Axness [58] and Dagan [33] tend to overestimate the field-
scale  longitudinal  spreading  of  a  nonreactive  solute.  Although  macrodispersivity  in 
the  transverse  direction  was  not  found  to  be  significantly  affected  by  the  presence  of 
the  variability  of  the  distribution  coefficient  field,  the  longitudinal  macrodispersivity 
of a reactive solute was found to be enhanced much more than that of a nonreactive one for
the negative correlation case considered. This led the authors to suggest that the
spatial  variability  of  the  distribution  coefficient  field  imparts  an  additional  variability 
on  the  local,  reactive  solute  velocity,  and  hence  that  different  macro-dispersivities 
may  have  to  be  defined  for  different  reactive  solutes.  Burr  et  al.  [14]  also  found  that 
the  concentration  uncertainty  for  the  reactive  solute  was  comparable  to  that  for  the 
nonreactive  one.  They  attributed  this  phenomenon  to  the  enhanced  plume  spreading 
for  the  reactive  solute.  Since  this  work  was  based  on  only  five  realizations,  its  findings 
are  subject  to  further  investigation. 

The  focus  of  this  chapter  is  placed  on  the  understanding  of  the  field-scale  trans- 
port of  a  sorbing  solute  (or  partitioning  tracer)  through  the  study  of  unconditional 
concentration  moments  in  a  three-dimensional,  heterogeneous  saturated  aquifer  with 
nonaqueous  phase  liquids  (NAPL)  present  in  an  immobile,  residual  phase.  Both  the 
initial  tracer  concentration  and  the  partitioning  coefficient  are  assumed  to  be  known 
perfectly.  The  unconditional  hydraulic  conductivity  and  volumetric  NAPL  content 
are  assumed  to  be  log-normally  distributed  with  a  statistically  isotropic,  stationary, 
exponential  correlation  structure.  Local  linear  equilibrium  sorption  is  assumed  be- 
tween the  partitioning  tracer  and  the  residual  NAPL.  Thus  the  presence  of  residual 
NAPL  leads  to  a  three-dimensional  spatially  variable  retardation  factor. 

The impact of NAPL heterogeneity, in addition to conductivity heterogeneity,
on the prediction of partitioning tracer concentration moments
is evaluated in this chapter. Unconditional concentration moments reveal considerable
insight into the way different sources of uncertainty interact, and provide a
good measure of the degree of correlation between solute concentration and other
random variables. This information is highly valuable for the design of reliable
sampling networks for site characterization and remediation.

The stochastic approach used in this research is similar to that described by
Graham and McLaughlin [61], except that a reactive solute in a three-dimensional
domain is considered rather than the two-dimensional transport of a nonreactive
solute that was considered in their study. Using an Eulerian perturbation method, a
system of six coupled partial differential moment equations (SPDE) is derived from
the traditional advection-dispersion-retardation equation. These equations describe
the transient behavior of the concentration moments and explicitly account for the
effects of the considered heterogeneities on the large-scale dispersion and prediction
uncertainty. A solution technique called the "square-root" decomposition method [82]
is introduced to reduce the computational time and memory required to solve the
problem. The moment equations are then solved numerically with a finite difference
algorithm. Five synthetic cases are presented in this chapter to evaluate the impact
of the various statistical parameters on the model predictions based on different
statistical relationships between the random fields of ln conductivity and ln volumetric
NAPL content. These five cases are (1) uncorrelated ln conductivity and $\ln\theta_n$, (2)
weakly positively correlated ln conductivity and $\ln\theta_n$, (3) weakly negatively
correlated ln conductivity and $\ln\theta_n$, (4) perfectly positively correlated ln conductivity and
$\ln\theta_n$, and (5) perfectly negatively correlated ln conductivity and $\ln\theta_n$.


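2.2    Theory  Unconditional  Moment  Equations

The advective-dispersive transport equation for a sorbing solute in a fully
saturated porous medium is [49, 47, 34]

$$ \theta_w \frac{\partial c}{\partial t} + \theta_n \frac{\partial S}{\partial t} = -\frac{\partial}{\partial x_i}\left(q_i c\right) + \frac{\partial}{\partial x_i}\left(\theta_w D_{ij} \frac{\partial c}{\partial x_j}\right), \quad \mathbf{x} \in D \tag{2.1} $$

$$ c(\mathbf{x}, t) = c_0, \quad \mathbf{x} \in D, \quad t = t_0 \tag{2.2} $$

$$ q_1 c - \theta_w D_{11} \frac{\partial c}{\partial x_1} = \begin{cases} q_1 c_b & \text{for } 0 < t < t_b, \ \mathbf{x} \in \partial D_{in} \\ 0 & \text{for } t > t_b, \ \mathbf{x} \in \partial D_{in} \\ q_1 c & \text{for } \forall t, \ \mathbf{x} \in \partial D_{out} \end{cases} \tag{2.3} $$

$$ \frac{\partial c}{\partial x_2} = 0 \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_y \tag{2.4} $$

$$ \frac{\partial c}{\partial x_3} = 0 \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_z \tag{2.5} $$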

where the solute concentration $c(\mathbf{x}, t)$ ([ML$^{-3}$]) and the sorbed solute $S$ (mass of solute
per unit volume of NAPL) are assumed to be nonstationary random fields of location
$\mathbf{x}$ ($x_i$, $i = 1, 2, 3$) and time $t$. Boldface letters are used to indicate vector and tensor
quantities. Equations (2.1) through (2.5) are defined over a finite three-dimensional
domain $D$ with boundary $\partial D$; hence all the indices take the values 1, 2, and 3.
Einstein's summation convention is implied. In this study the volumetric water
content $\theta_w$ is assumed to be a deterministic quantity, uniform in space and invariant
in time. The local dispersion tensor $D_{ij}$ ([L$^2$T$^{-1}$]) is also assumed to be deterministic
and related to the mean pore velocity as follows [7]:

$$ D_{ij} = \alpha_T v \delta_{ij} + (\alpha_L - \alpha_T) \frac{v_i v_j}{v} \tag{2.6} $$

in which $\alpha_T$ and $\alpha_L$ are the spatially constant, temporally invariant transverse and
longitudinal local dispersivities; $v$ is the magnitude of the mean pore velocity vector; and
$v_i$ and $v_j$ are the components of pore water velocity in the $i$ and $j$ directions. Molecular
diffusion is neglected because the problem is posed at the field scale, where hydrodynamic
dispersion dominates molecular diffusion. The initial condition in equation
(2.2) states that at time zero the concentration in domain $D$ is zero or at some known
level $c_0$.
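
As a minimal numerical sketch of equation (2.6), the following Python function assembles the local dispersion tensor from the pore-velocity components. The function name `dispersion_tensor` and the example velocity (a Darcy flux of 1.006 m/day along $x_1$ divided by $\theta_w = 0.21$) are illustrative assumptions and are not taken from the original formulation.

```python
import math

def dispersion_tensor(v, alpha_L, alpha_T):
    """Return the 3x3 local dispersion tensor D_ij of equation (2.6).

    v       : sequence of three pore-velocity components (v_1, v_2, v_3)
    alpha_L : longitudinal local dispersivity [L]
    alpha_T : transverse local dispersivity [L]
    """
    speed = math.sqrt(sum(vi * vi for vi in v))  # magnitude of the velocity vector
    if speed == 0.0:
        # No flow: mechanical dispersion vanishes (molecular diffusion is neglected).
        return [[0.0] * 3 for _ in range(3)]
    D = []
    for i in range(3):
        row = []
        for j in range(3):
            kron = 1.0 if i == j else 0.0  # Kronecker delta
            row.append(alpha_T * speed * kron
                       + (alpha_L - alpha_T) * v[i] * v[j] / speed)
        D.append(row)
    return D

# Illustrative values (assumed): flow along x_1 with v = q / theta_w.
v_example = (1.006 / 0.21, 0.0, 0.0)
D_example = dispersion_tensor(v_example, alpha_L=0.1, alpha_T=0.1)
```

When the two dispersivities are equal, the second term of equation (2.6) vanishes and the tensor reduces to an isotropic diagonal form; with unequal dispersivities and flow along $x_1$, the diagonal entries become $\alpha_L v$ and $\alpha_T v$.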

Boundary conditions are summarized in equations (2.3) through (2.5). In this
study, a constant flux is maintained longitudinally through the upstream boundary
$\partial D_{in}$ and leaves the domain through the downstream boundary $\partial D_{out}$. A partitioning
tracer is injected through $\partial D_{in}$ for a length of time $t_b$. At boundaries parallel to the
mean flow direction, i.e., $\partial D_y$ and $\partial D_z$, a zero mass flux boundary condition is
imposed by equations (2.4) and (2.5), respectively. In addition, the initial
concentration $c_0$, injection concentration $c_b$, and time of injection $t_b$ are regarded as
deterministic constants. The Darcy fluxes $q_i$ ([L$^3$L$^{-2}$T$^{-1}$]), $i = 1, 2, 3$, in equation
(2.1) are assumed to be time-invariant random fields. Similarly, the NAPL content $\theta_n$
is assumed to be a time-invariant random field which is related to the NAPL residual
saturation $S_n$ according to

$$ \theta_n = \frac{S_n}{1 - S_n} \theta_w \tag{2.7} $$

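The sorption process is postulated to follow a linear equilibrium isotherm, i.e.,

$$ S = K_N c \tag{2.8} $$

Substitution of equation (2.7) and equation (2.8) into equation (2.1) leads to

$$ \left(\theta_w + K_N \theta_n\right) \frac{\partial c}{\partial t} = -\frac{\partial}{\partial x_i}\left(q_i c\right) + \frac{\partial}{\partial x_i}\left(\theta_w D_{ij} \frac{\partial c}{\partial x_j}\right), \quad \mathbf{x} \in D \tag{2.9} $$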
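
Dividing the storage coefficient of the retarded transport equation by $\theta_w$ gives the familiar retardation factor $R = 1 + K_N \theta_n / \theta_w = 1 + K_N S_n / (1 - S_n)$. The short sketch below evaluates it; the function names are hypothetical, and the inputs ($\theta_w = 0.21$, $K_N = 9.0$, and a mean of $\ln\theta_n$ of -4.62) are the values listed in Table 2.1, used here only for illustration.

```python
import math

def napl_content(S_n, theta_w):
    """Volumetric NAPL content theta_n from residual saturation S_n, equation (2.7)."""
    if not 0.0 <= S_n < 1.0:
        raise ValueError("S_n must lie in [0, 1)")
    return S_n / (1.0 - S_n) * theta_w

def retardation_factor(theta_n, theta_w, K_N):
    """Retardation factor R = 1 + K_N * theta_n / theta_w."""
    return 1.0 + K_N * theta_n / theta_w

theta_w = 0.21
K_N = 9.0
theta_n = math.exp(-4.62)   # NAPL content at the mean of ln(theta_n)
R = retardation_factor(theta_n, theta_w, K_N)
```

For these inputs the retardation factor is about 1.42, so a small volumetric NAPL content (roughly 1 percent) already slows the partitioning tracer noticeably relative to a nonpartitioning one.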

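The random fields are each expanded into the sum of a mean and a small
perturbation around the mean, i.e.,

$$ c(\mathbf{x}, t) = \bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t), \quad E[c] = \bar{c}, \quad E[\delta c] = 0 \tag{2.10} $$

$$ q_i(\mathbf{x}) = \bar{q}_i(\mathbf{x}) + \delta q_i(\mathbf{x}), \quad E[q_i] = \bar{q}_i, \quad E[\delta q_i] = 0 \tag{2.11} $$

$$ \ln\theta_n(\mathbf{x}) = Y(\mathbf{x}) = \bar{y}(\mathbf{x}) + \delta y(\mathbf{x}), \quad E[\ln\theta_n] = \bar{y}, \quad E[\delta y] = 0 \tag{2.12} $$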

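Substitution of equations (2.10), (2.11), and (2.12) into the governing equations gives

$$ \left(\theta_w + K_N e^{\bar{y} + \delta y}\right) \frac{\partial}{\partial t}\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right] = -\frac{\partial}{\partial x_i}\left\{\left[\bar{q}_i(\mathbf{x}) + \delta q_i(\mathbf{x})\right]\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right]\right\} + \frac{\partial}{\partial x_i}\left\{\theta_w D_{ij} \frac{\partial}{\partial x_j}\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right]\right\}, \quad \mathbf{x} \in D \tag{2.13} $$

$$ \bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t) = c_0, \quad \mathbf{x} \in D, \quad t = t_0 \tag{2.14} $$

$$ \left[\bar{q}_1(\mathbf{x}) + \delta q_1(\mathbf{x})\right]\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right] - \theta_w D_{11} \frac{\partial}{\partial x_1}\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right] = \begin{cases} \left[\bar{q}_1(\mathbf{x}) + \delta q_1(\mathbf{x})\right] c_b & \text{for } 0 < t < t_b, \ \mathbf{x} \in \partial D_{in} \\ 0 & \text{for } t > t_b, \ \mathbf{x} \in \partial D_{in} \end{cases} \tag{2.15} $$

$$ \frac{\partial}{\partial x_1}\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_{out} \tag{2.16} $$

$$ \frac{\partial}{\partial x_2}\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_y \tag{2.17} $$

$$ \frac{\partial}{\partial x_3}\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_z \tag{2.18} $$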

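If the small-perturbation assumption is applied to the random field $\ln\theta_n$, then
the exponential term on the left-hand side of equation (2.13) can be approximated by
a Taylor series expansion, i.e., $e^{\delta y} = 1 + \delta y + \cdots$. Neglecting terms of order $O(\delta y^2)$,
equation (2.13) can be rewritten as

$$ \left[\theta_w + K_N e^{\bar{y}} + K_N e^{\bar{y}} \, \delta y(\mathbf{x})\right] \frac{\partial}{\partial t}\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right] \approx -\frac{\partial}{\partial x_i}\left\{\left[\bar{q}_i(\mathbf{x}) + \delta q_i(\mathbf{x})\right]\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right]\right\} + \frac{\partial}{\partial x_i}\left\{\theta_w D_{ij} \frac{\partial}{\partial x_j}\left[\bar{c}(\mathbf{x}, t) + \delta c(\mathbf{x}, t)\right]\right\}, \quad \mathbf{x} \in D \tag{2.19} $$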
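
To give a feel for the size of the neglected terms, the following sketch compares $e^{\delta y}$ with its first-order approximation $1 + \delta y$ for a few perturbation magnitudes. The sample values are illustrative assumptions; the last one equals one standard deviation of $\ln\theta_n$ for the variance of 0.746 listed in Table 2.1.

```python
import math

def linearization_error(delta_y):
    """Relative error of approximating exp(delta_y) by 1 + delta_y."""
    exact = math.exp(delta_y)
    first_order = 1.0 + delta_y
    return abs(exact - first_order) / exact

sigma_y = math.sqrt(0.746)
# Relative truncation error for small, moderate, and one-sigma perturbations.
errors = {dy: linearization_error(dy) for dy in (0.1, 0.5, sigma_y)}
```

The error grows roughly quadratically with the perturbation size, which is why the moment equations are described as strictly valid only for small perturbations.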

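The ensemble concentration mean equations are obtained by taking the
expected value of equations (2.19), (2.14), (2.15), (2.16), (2.17), and (2.18):

$$ \left(\theta_w + K_N e^{\bar{y}}\right) \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial t} \approx -\left(K_N e^{\bar{y}}\right) \frac{\partial}{\partial t}\left[P_{yc}(\mathbf{x}, \mathbf{x}, t)\right] - \frac{\partial}{\partial x_i}\left[\bar{q}_i(\mathbf{x}) \bar{c}(\mathbf{x}, t)\right] - \frac{\partial}{\partial x_i}\left[P_{q_i c}(\mathbf{x}, \mathbf{x}, t)\right] + \frac{\partial}{\partial x_i}\left[\theta_w D_{ij} \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial x_j}\right], \quad \mathbf{x} \in D \tag{2.20} $$

$$ \bar{c}(\mathbf{x}, t) = c_0, \quad \mathbf{x} \in D, \quad t = t_0 \tag{2.21} $$

$$ \bar{q}_1(\mathbf{x}) \bar{c}(\mathbf{x}, t) + P_{q_1 c}(\mathbf{x}, \mathbf{x}, t) - \theta_w D_{11} \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial x_1} = \begin{cases} \bar{q}_1(\mathbf{x}) c_b & \text{for } 0 < t < t_b, \ \mathbf{x} \in \partial D_{in} \\ 0 & \text{for } t > t_b, \ \mathbf{x} \in \partial D_{in} \end{cases} \tag{2.22} $$

$$ \frac{\partial}{\partial x_1}\left[\bar{c}(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_{out} \tag{2.23} $$

$$ \frac{\partial}{\partial x_2}\left[\bar{c}(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_y \tag{2.24} $$

$$ \frac{\partial}{\partial x_3}\left[\bar{c}(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_z \tag{2.25} $$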

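The first-order concentration perturbation equation can be derived by
subtracting the mean equations (2.20), (2.21), (2.22), (2.23), (2.24), and (2.25) from
equations (2.19), (2.14), (2.15), (2.16), (2.17), and (2.18), and neglecting terms
involving the products of perturbations:

$$ \left(\theta_w + K_N e^{\bar{y}}\right) \frac{\partial \delta c(\mathbf{x}, t)}{\partial t} \approx -\left(K_N e^{\bar{y}}\right) \delta y(\mathbf{x}) \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial t} - \frac{\partial}{\partial x_i}\left[\bar{q}_i(\mathbf{x}) \delta c(\mathbf{x}, t)\right] + \frac{\partial}{\partial x_i}\left[\theta_w D_{ij} \frac{\partial}{\partial x_j} \delta c(\mathbf{x}, t)\right] - \frac{\partial}{\partial x_i}\left[\delta q_i(\mathbf{x}) \bar{c}(\mathbf{x}, t)\right], \quad \mathbf{x} \in D \tag{2.26} $$

$$ \delta c(\mathbf{x}, t) = 0, \quad \mathbf{x} \in D, \quad t = t_0 \tag{2.27} $$

$$ \bar{q}_1(\mathbf{x}) \delta c(\mathbf{x}, t) + \delta q_1(\mathbf{x}) \bar{c}(\mathbf{x}, t) - \theta_w D_{11} \frac{\partial}{\partial x_1}\left[\delta c(\mathbf{x}, t)\right] = \begin{cases} \delta q_1(\mathbf{x}) c_b & \text{for } 0 < t < t_b, \ \mathbf{x} \in \partial D_{in} \\ 0 & \text{for } t > t_b, \ \mathbf{x} \in \partial D_{in} \end{cases} \tag{2.28} $$

$$ \frac{\partial}{\partial x_1}\left[\delta c(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_{out} \tag{2.29} $$

$$ \frac{\partial}{\partial x_2}\left[\delta c(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_y \tag{2.30} $$

$$ \frac{\partial}{\partial x_3}\left[\delta c(\mathbf{x}, t)\right] = 0, \quad \text{for } \forall t, \ \mathbf{x} \in \partial D_z \tag{2.31} $$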


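Notice that both the mean concentration and the concentration perturbation
equations have the general structure of the advective-dispersive transport equation.
The mean concentration equation (2.20) involves two additional terms, one of which
is the divergence of the cross-covariance $P_{q_i c}$ evaluated at the same location for both
the concentration perturbation and the flux perturbation. An equation for the more
general cross-covariance $P_{q_i c}(\mathbf{x}', \mathbf{x}, t)$, which accounts for perturbations at two different
locations, can be obtained by pre-multiplying the concentration perturbation equations
(2.26), (2.27), (2.28), (2.29), (2.30), and (2.31) at location $\mathbf{x}$ by the flux perturbation
$\delta q_i(\mathbf{x}')$ at location $\mathbf{x}'$ and taking expectations. The resulting (approximate)
cross-covariance equation for $P_{q_i c}$ is

$$ \left(\theta_w + K_N e^{\bar{y}}\right) \frac{\partial P_{q_i c}(\mathbf{x}', \mathbf{x}, t)}{\partial t} \approx -\left(K_N e^{\bar{y}}\right) P_{q_i y}(\mathbf{x}', \mathbf{x}) \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial t} - \frac{\partial}{\partial x_j}\left[\bar{q}_j(\mathbf{x}) P_{q_i c}(\mathbf{x}', \mathbf{x}, t)\right] + \frac{\partial}{\partial x_j}\left[\theta_w D_{jk} \frac{\partial}{\partial x_k} P_{q_i c}(\mathbf{x}', \mathbf{x}, t)\right] - \frac{\partial}{\partial x_j}\left[\bar{c}(\mathbf{x}, t) P_{q_i q_j}(\mathbf{x}', \mathbf{x})\right], \quad \mathbf{x}', \mathbf{x} \in D \tag{2.32} $$

$$ P_{q_i c}(\mathbf{x}', \mathbf{x}, t) = 0, \quad \mathbf{x}', \mathbf{x} \in D, \quad t = t_0 \tag{2.33} $$

$$ \bar{q}_1(\mathbf{x}) P_{q_i c}(\mathbf{x}', \mathbf{x}, t) + P_{q_i q_1}(\mathbf{x}', \mathbf{x}) \bar{c}(\mathbf{x}, t) - \theta_w D_{11} \frac{\partial}{\partial x_1}\left[P_{q_i c}(\mathbf{x}', \mathbf{x}, t)\right] = \begin{cases} P_{q_i q_1}(\mathbf{x}', \mathbf{x}) c_b & \text{for } 0 < t < t_b \\ 0 & \text{for } t > t_b \end{cases} \quad \mathbf{x} \in \partial D_{in}, \ \mathbf{x}' \in D, \quad \text{for } i = 1, 2, 3 \tag{2.34} $$

$$ \frac{\partial}{\partial x_1}\left[P_{q_i c}(\mathbf{x}', \mathbf{x}, t)\right] = 0, \quad \mathbf{x} \in \partial D_{out}, \ \mathbf{x}' \in D, \quad \text{for } i = 1, 2, 3 \tag{2.35} $$

$$ \frac{\partial}{\partial x_2}\left[P_{q_i c}(\mathbf{x}', \mathbf{x}, t)\right] = 0, \quad \mathbf{x} \in \partial D_y, \ \mathbf{x}' \in D, \quad \text{for } i = 1, 2, 3 \tag{2.36} $$

$$ \frac{\partial}{\partial x_3}\left[P_{q_i c}(\mathbf{x}', \mathbf{x}, t)\right] = 0, \quad \mathbf{x} \in \partial D_z, \ \mathbf{x}' \in D, \quad \text{for } i = 1, 2, 3 \tag{2.37} $$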

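The second additional term in the mean concentration equation (2.20) is the
time derivative of $P_{yc}(\mathbf{x}, \mathbf{x}, t)$, which is the cross-covariance between the random NAPL
residual content and concentration perturbations at the same location $\mathbf{x}$ at time $t$.
The equation for $P_{yc}(\mathbf{x}', \mathbf{x}, t)$ can be derived similarly to the equation for $P_{q_i c}(\mathbf{x}', \mathbf{x}, t)$
by pre-multiplying equations (2.26), (2.27), (2.28), (2.29), (2.30), and (2.31) by $\delta y(\mathbf{x}')$
and taking the expected value:

$$ \left(\theta_w + K_N e^{\bar{y}}\right) \frac{\partial P_{yc}(\mathbf{x}', \mathbf{x}, t)}{\partial t} \approx -\left(K_N e^{\bar{y}}\right) P_{yy}(\mathbf{x}', \mathbf{x}) \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial t} - \frac{\partial}{\partial x_i}\left[\bar{q}_i(\mathbf{x}) P_{yc}(\mathbf{x}', \mathbf{x}, t)\right] + \frac{\partial}{\partial x_i}\left[\theta_w D_{ij} \frac{\partial}{\partial x_j} P_{yc}(\mathbf{x}', \mathbf{x}, t)\right] - \frac{\partial}{\partial x_i}\left[\bar{c}(\mathbf{x}, t) P_{y q_i}(\mathbf{x}', \mathbf{x})\right], \quad \mathbf{x}', \mathbf{x} \in D \tag{2.38} $$

$$ P_{yc}(\mathbf{x}', \mathbf{x}, t) = 0, \quad \mathbf{x}', \mathbf{x} \in D, \quad t = t_0 \tag{2.39} $$

$$ \bar{q}_1(\mathbf{x}) P_{yc}(\mathbf{x}', \mathbf{x}, t) + P_{y q_1}(\mathbf{x}', \mathbf{x}) \bar{c}(\mathbf{x}, t) - \theta_w D_{11} \frac{\partial}{\partial x_1}\left[P_{yc}(\mathbf{x}', \mathbf{x}, t)\right] = \begin{cases} P_{y q_1}(\mathbf{x}', \mathbf{x}) c_b & \text{for } 0 < t < t_b \\ 0 & \text{for } t > t_b \end{cases} \quad \mathbf{x} \in \partial D_{in}, \ \mathbf{x}' \in D \tag{2.40} $$

$$ \frac{\partial}{\partial x_1}\left[P_{yc}(\mathbf{x}', \mathbf{x}, t)\right] = 0, \quad \mathbf{x} \in \partial D_{out}, \ \mathbf{x}' \in D \tag{2.41} $$

$$ \frac{\partial}{\partial x_2}\left[P_{yc}(\mathbf{x}', \mathbf{x}, t)\right] = 0, \quad \mathbf{x} \in \partial D_y, \ \mathbf{x}' \in D \tag{2.42} $$

$$ \frac{\partial}{\partial x_3}\left[P_{yc}(\mathbf{x}', \mathbf{x}, t)\right] = 0, \quad \mathbf{x} \in \partial D_z, \ \mathbf{x}' \in D \tag{2.43} $$

The concentration auto-covariance is defined as follows:

$$ P_{cc}(\mathbf{x}', \mathbf{x}, t) = \overline{\delta c(\mathbf{x}', t) \, \delta c(\mathbf{x}, t)} \tag{2.44} $$

and the derivative of $P_{cc}(\mathbf{x}', \mathbf{x}, t)$ with respect to time is

$$ \frac{\partial}{\partial t} P_{cc}(\mathbf{x}', \mathbf{x}, t) = \overline{\delta c(\mathbf{x}', t) \frac{\partial}{\partial t}\left[\delta c(\mathbf{x}, t)\right]} + \overline{\frac{\partial}{\partial t}\left[\delta c(\mathbf{x}', t)\right] \delta c(\mathbf{x}, t)} \tag{2.45} $$

where the overline in equations (2.44) and (2.45) represents the operation of taking the
expected value. Note that the temporal derivatives of the concentration perturbation at
the two locations appearing in equation (2.45) are defined by equation (2.26). The resulting
approximate concentration covariance equation is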

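$$ \begin{aligned} \left(\theta_w + K_N e^{\bar{y}}\right) \frac{\partial P_{cc}(\mathbf{x}', \mathbf{x}, t)}{\partial t} \approx{} & -\left(K_N e^{\bar{y}}\right) P_{cy}(\mathbf{x}', \mathbf{x}, t) \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial t} - \frac{\partial}{\partial x_i}\left[\bar{q}_i(\mathbf{x}) P_{cc}(\mathbf{x}', \mathbf{x}, t)\right] \\ & + \frac{\partial}{\partial x_i}\left[\theta_w D_{ij} \frac{\partial}{\partial x_j} P_{cc}(\mathbf{x}', \mathbf{x}, t)\right] - \frac{\partial}{\partial x_i}\left[\bar{c}(\mathbf{x}, t) P_{c q_i}(\mathbf{x}', \mathbf{x}, t)\right] \\ & - \left(K_N e^{\bar{y}}\right) P_{cy}(\mathbf{x}, \mathbf{x}', t) \frac{\partial \bar{c}(\mathbf{x}', t)}{\partial t} - \frac{\partial}{\partial x'_i}\left[\bar{q}_i(\mathbf{x}') P_{cc}(\mathbf{x}, \mathbf{x}', t)\right] \\ & + \frac{\partial}{\partial x'_i}\left[\theta_w D_{ij} \frac{\partial}{\partial x'_j} P_{cc}(\mathbf{x}, \mathbf{x}', t)\right] - \frac{\partial}{\partial x'_i}\left[\bar{c}(\mathbf{x}', t) P_{c q_i}(\mathbf{x}, \mathbf{x}', t)\right], \quad \mathbf{x}', \mathbf{x} \in D \end{aligned} \tag{2.46} $$

$$ P_{cc}(\mathbf{x}', \mathbf{x}, t) = 0, \quad \mathbf{x}', \mathbf{x} \in D, \quad t = t_0 \tag{2.47} $$

$$ \bar{q}_1(\mathbf{x}) P_{cc}(\mathbf{x}', \mathbf{x}, t) + P_{c q_1}(\mathbf{x}', \mathbf{x}, t) \bar{c}(\mathbf{x}, t) - \theta_w D_{11} \frac{\partial}{\partial x_1} P_{cc}(\mathbf{x}', \mathbf{x}, t) = \begin{cases} P_{c q_1}(\mathbf{x}', \mathbf{x}, t) c_b & \text{for } 0 < t < t_b \\ 0 & \text{for } t > t_b \end{cases} \quad \mathbf{x} \in \partial D_{in}, \ \mathbf{x}' \in D \tag{2.48} $$

$$ \bar{q}_1(\mathbf{x}') P_{cc}(\mathbf{x}, \mathbf{x}', t) + P_{c q_1}(\mathbf{x}, \mathbf{x}', t) \bar{c}(\mathbf{x}', t) - \theta_w D_{11} \frac{\partial}{\partial x'_1} P_{cc}(\mathbf{x}, \mathbf{x}', t) = \begin{cases} P_{c q_1}(\mathbf{x}, \mathbf{x}', t) c_b & \text{for } 0 < t < t_b \\ 0 & \text{for } t > t_b \end{cases} \quad \mathbf{x}' \in \partial D_{in}, \ \mathbf{x} \in D \tag{2.49} $$

$$ \frac{\partial}{\partial x_1} P_{cc}(\mathbf{x}', \mathbf{x}, t) = 0, \quad \mathbf{x} \in \partial D_{out}, \ \mathbf{x}' \in D \tag{2.50} $$

$$ \frac{\partial}{\partial x'_1} P_{cc}(\mathbf{x}, \mathbf{x}', t) = 0, \quad \mathbf{x}' \in \partial D_{out}, \ \mathbf{x} \in D \tag{2.51} $$

$$ \frac{\partial}{\partial x_2} P_{cc}(\mathbf{x}', \mathbf{x}, t) = 0, \quad \mathbf{x} \in \partial D_y, \ \mathbf{x}' \in D \tag{2.52} $$

$$ \frac{\partial}{\partial x'_2} P_{cc}(\mathbf{x}, \mathbf{x}', t) = 0, \quad \mathbf{x}' \in \partial D_y, \ \mathbf{x} \in D \tag{2.53} $$

$$ \frac{\partial}{\partial x_3} P_{cc}(\mathbf{x}', \mathbf{x}, t) = 0, \quad \mathbf{x} \in \partial D_z, \ \mathbf{x}' \in D \tag{2.54} $$

$$ \frac{\partial}{\partial x'_3} P_{cc}(\mathbf{x}, \mathbf{x}', t) = 0, \quad \mathbf{x}' \in \partial D_z, \ \mathbf{x} \in D \tag{2.55} $$

where equations (2.48) through (2.55) are the boundary conditions for equation (2.46).
The concentration auto-covariance equations explicitly contain derivatives with respect
to two sets of coordinates, $\mathbf{x}$ and $\mathbf{x}'$, which complicates their solution.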

The ensemble mean concentration equations (2.20)-(2.25), together with the
auto- and cross-covariance equations (equations (2.32)-(2.37) for $P_{q_i c}$, $i = 1, 2, 3$;
equations (2.38)-(2.43) for $P_{yc}$; and equations (2.46)-(2.55) for $P_{cc}$), form a system of six
coupled stochastic partial differential equations. These equations describe the temporal
propagation of the first and second moments of the concentration field and depend
on the first and second moments of the input parameters, i.e., the Darcy flux and
the residual NAPL field. Note that, similar to the mean concentration equation, the
ensemble covariance equations also have the structure of the traditional advection-dispersion
equation. These equations are strictly valid only under the small-perturbation
assumption.

2.3    Solution  Methods

The covariance equations for $P_{cc}(\mathbf{x}', \mathbf{x}, t)$, $P_{q_i c}(\mathbf{x}', \mathbf{x}, t)$, $i = 1, 2, 3$, and $P_{yc}(\mathbf{x}', \mathbf{x}, t)$
derived in Section 2.2 generally account for two different location vectors, denoted
by $\mathbf{x}'$ and $\mathbf{x}$. A square-root decomposition method [82] is introduced to simplify these
equations so that they are a function of only one location variable $\mathbf{x}$ (see Appendix
B for details). The advantage of the decomposition method is that it greatly reduces
the computer run time and data storage needed to solve the problem.

The partial differential equations for the square roots of the covariances are
then solved with a three-dimensional, block-centered, finite difference algorithm which
uses second-order central differencing in space and first-order backward (fully implicit)
differencing in time. An iterative line successive over-relaxation (LSOR) solver, whose
development is illustrated in Appendix E, is implemented in the algorithm. The covariance
matrix is obtained at designated output times through recomposition of the square-root solutions.
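
The discretization described above can be illustrated with a one-dimensional analogue. The sketch below is a simplified illustration under stated assumptions, not the dissertation's algorithm: it advances the concentration itself rather than a covariance square root, it uses uniform coefficients, and it replaces the iterative LSOR solver with a direct tridiagonal (Thomas) solve, which is sufficient in one dimension. All function names and the default parameter values (including the uniform $\theta_n = 0.01$ and the 56-cell grid) are hypothetical choices made for the example.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by the Thomas algorithm (no pivoting)."""
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x

def implicit_step(c_old, c_in, q, theta_w, D, storage, dx, dt):
    """Advance block-centered concentrations by one backward-Euler step.

    storage is (theta_w + K_N * theta_n); c_in is the injected concentration
    applied through the third-type (total flux) inflow boundary.
    """
    n = len(c_old)
    r = dt / dx
    adv = 0.5 * q              # central-difference advective weight
    dif = theta_w * D / dx     # dispersive conductance between cell centers
    lower = [-r * (adv + dif)] * n
    upper = [r * (adv - dif)] * n
    diag = [storage + 2.0 * r * dif] * n
    rhs = [storage * c for c in c_old]
    # Inflow face: prescribed total flux q * c_in.
    lower[0] = 0.0
    diag[0] = storage + r * (adv + dif)
    rhs[0] += r * q * c_in
    # Outflow face: zero concentration gradient, so only advection leaves.
    upper[-1] = 0.0
    diag[-1] = storage + r * (adv + dif)
    return thomas_solve(lower, diag, upper, rhs)

def run_pulse(n=56, length=3.556, q=1.006, theta_w=0.21, alpha=0.1,
              K_N=9.0, theta_n=0.01, c_b=1.0, t_b=0.1, t_end=0.5, dt=0.005):
    """Simulate a finite-duration pulse; return (c, mass_in, mass_out, mass_stored)."""
    dx = length / n
    D = alpha * q / theta_w            # D = alpha * v with v = q / theta_w
    storage = theta_w + K_N * theta_n
    c = [0.0] * n
    mass_in = mass_out = 0.0
    steps = round(t_end / dt)
    for k in range(1, steps + 1):
        t_new = k * dt
        c_in = c_b if t_new <= t_b + 1e-12 else 0.0
        c = implicit_step(c, c_in, q, theta_w, D, storage, dx, dt)
        mass_in += q * c_in * dt       # mass entering per unit cross-sectional area
        mass_out += q * c[-1] * dt     # mass leaving through the outflow face
    mass_stored = storage * dx * sum(c)
    return c, mass_in, mass_out, mass_stored
```

Because the scheme is written in flux form, the injected mass equals the stored mass (dissolved plus partitioned) plus the mass that has left the domain, up to round-off. Calling `run_pulse()` and comparing `mass_in` with `mass_out + mass_stored` is a quick consistency check of this kind. The example grid is deliberately finer than the 0.254 m spacing of the case study so that the cell Peclet number stays below 2 and the central-difference solution remains free of oscillations.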

2.4    Case  Studies 

In this section, the unconditional moment equations derived above are solved
for a partitioning tracer migrating through a three-dimensional rectangular
box located in a saturated, heterogeneous aquifer with an unknown spatial
distribution of residual NAPL. Five different assumptions for the statistical correlations
between the ln hydraulic conductivity field and the ln volumetric NAPL content field
are considered.

To reduce the computational demand of the algorithm, the mean concentration
equations ((2.20) through (2.25)) were solved to first order. Thus the resulting mean
concentration $\bar{c}(\mathbf{x}, t)$ does not contain the effects of the second-order terms, i.e., the spatial
derivative term $\frac{\partial}{\partial x_i} P_{q_i c}(\mathbf{x}, \mathbf{x}, t)$ and the temporal derivative term $(K_N e^{\bar{y}}) \frac{\partial}{\partial t} P_{yc}(\mathbf{x}, \mathbf{x}, t)$. It
is expected that these second-order effects should be small in a small-scale, short-
duration, interwell partitioning tracer test.

2.4.1    Definition  of  the  problem

Figure 2.1 is a schematic diagram of the three-dimensional simulation domain
(3.556 m x 3.556 m x 1.524 m) located in a saturated aquifer. The simulation
domain is subdivided into a 14 x 7 x 7 grid system (Figure 2.2). The steady-state
flow condition and solute release scheme are designed to resemble the
field experimental conditions at the OU-1 test cell at Hill AFB, Utah [2]. A partitioning
tracer (with a partitioning coefficient $K_N$ of 9.0) is introduced from the upstream
face of the box at a fixed concentration $c_b$ for a period of time $t_b$ (0.1 days).
Neumann boundary conditions are assumed to apply on all sides of the domain except
the inflow boundary, where a third-type boundary condition is adopted.
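
As a small arithmetic check of the discretization, the sketch below computes the block sizes implied by the stated domain dimensions and the 14 x 7 x 7 grid. The helper names are hypothetical and the calculation assumes uniform, block-centered cells.

```python
def grid_spacing(lengths, cells):
    """Return the uniform cell size along each coordinate direction."""
    return tuple(L / n for L, n in zip(lengths, cells))

def cell_centers(length, n):
    """Coordinates of block centers along one direction, measured from 0."""
    d = length / n
    return [(i + 0.5) * d for i in range(n)]

# Domain dimensions [m] and cell counts quoted in the text.
dx, dy, dz = grid_spacing((3.556, 3.556, 1.524), (14, 7, 7))
x_centers = cell_centers(3.556, 14)
```

The resulting spacings are 0.254 m, 0.508 m, and about 0.218 m, so the longitudinal cell size is slightly smaller than the 0.3048 m correlation length used for the random fields.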

The  flow  field  is  assumed  to  be  at  steady  state  with  the  mean  flow  direction 
parallel  to  the  x  direction.  For  the  unconditional  simulation,  the  means  of  the  flux 
components  in  the  transverse  directions  (qy  and  qz)  are  assumed  to  be  zero,  while 
in  the  mean  flow  direction,  qx  is  1.006  m/day  which  reflects  a  longitudinal  hydraulic 
gradient  (J)  of  0.05893. 

Table 2.1 lists all of the parameters used in the example problems. These
parameters are selected to approximately resemble the field tracer experiment
conducted by Annable et al. [5]. The hydraulic conductivity is assumed to be locally
isotropic and has an unconditional mean of 17.1 m/day. Local dispersivity is assumed
isotropic and takes the value of 10 cm. The volumetric water content ($\theta_w = 0.21$) is
uniformly distributed and temporally invariant. It is further assumed that the
volumetric water content ($\theta_w$), dispersivities ($\alpha_L$ and $\alpha_T$), tracer initial concentration ($c_b$),
time of injection ($t_b$), tracer partitioning coefficient ($K_N$), and the mean hydraulic
gradient ($J$) are all known perfectly.

Figure 2.1: Schematic diagram of the three-dimensional simulation domain and
boundary conditions. (Labels recoverable from the figure: inflow boundary condition
$q_x c - \theta_w D_x (\partial c / \partial x) = q_x c_b$ for $0 < t < 0.1$ d; origin at (0, 0, 0); $x$ axis.)

Figure 2.2: Finite difference mesh for the example problem. (Label recoverable from
the figure: mean flow direction.)

Using a steady-state spectral approach described by Gelhar [56, 57], the flux
spectral density function for three-dimensional steady-state flow is (after Gelhar and
Axness [58])

$$ S_{q_i q_j}(\mathbf{k}) = K_g^2 J_m J_n \left(\delta_{im} - \frac{k_i k_m}{k^2}\right) \left(\delta_{jn} - \frac{k_j k_n}{k^2}\right) S_{ff}(\mathbf{k}) \tag{2.56} $$

where $\mathbf{k}$ is the wave number vector and $k$ is its magnitude, $J_i$ is the hydraulic gradient in the $i$-th direction,
$K_g$ is the geometric mean of the hydraulic conductivity $K(\mathbf{x})$, i.e., $K_g = \exp\left(\overline{\ln K(\mathbf{x})}\right)$,
and $S_{ff}$ is the spectral density function for $\ln K$. In this work, it is assumed that
the $\ln K$ field is statistically isotropic with the spectral density function that corresponds
to a negative exponential covariance:

$$ S_{ff}(\mathbf{k}) = \frac{\sigma_f^2 \lambda_f^3}{\pi^2 \left(1 + k^2 \lambda_f^2\right)^2} \tag{2.57} $$

To obtain a closed-form flux covariance function $P_{q_i q_j}(s_1, s_2, s_3)$, equation (2.57) is
substituted into equation (2.56) and the inverse Fourier transform of the resulting
flux spectral density function $S_{q_i q_j}(k_1, k_2, k_3)$ is taken. The derivation of the flux
covariance functions is detailed in Appendix A. It is worth noting that the flux
covariance functions are anisotropic even though the input $\ln K$ covariance is isotropic.
This is due to the fact that the mean hydraulic gradient establishes a preferential
flow direction that affects the magnitude of the correlation between any two points
in space along different directions [61]. Consequently, it is reasonable to expect the
concentration covariance function to be anisotropic as a result of the input anisotropy
of $P_{q_i q_j}(s_1, s_2, s_3)$ and of the advection term in the transport equation.
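
The anisotropy of the flux statistics can be seen directly from the spectral relations for the ln K field and the Darcy flux. The following sketch evaluates both spectra for a given wave-number vector; the function names are hypothetical, indices are zero-based in the code, and the numerical inputs used in the usage note are the Table 2.1 values applied purely for illustration.

```python
import math

def spectrum_lnK(k, sigma_f2, lam):
    """Isotropic ln K spectral density (exponential covariance) at wave-number vector k."""
    k2 = sum(ki * ki for ki in k)
    return sigma_f2 * lam ** 3 / (math.pi ** 2 * (1.0 + k2 * lam ** 2) ** 2)

def spectrum_flux(i, j, k, J, K_g, sigma_f2, lam):
    """Flux cross-spectral density S_{q_i q_j}(k) for mean gradient vector J."""
    k2 = sum(ki * ki for ki in k)
    if k2 == 0.0:
        raise ValueError("the projection k_i k_m / k^2 is undefined at k = 0")

    def proj(a, m):
        # (delta_am - k_a k_m / k^2): removes the component along k
        return (1.0 if a == m else 0.0) - k[a] * k[m] / k2

    # Summation over the repeated indices m and n (Einstein convention).
    a_i = sum(J[m] * proj(i, m) for m in range(3))
    a_j = sum(J[n] * proj(j, n) for n in range(3))
    return K_g ** 2 * a_i * a_j * spectrum_lnK(k, sigma_f2, lam)
```

For a mean gradient along $x_1$, for example `J = (0.05893, 0.0, 0.0)` with `K_g = 17.1`, `sigma_f2 = 1.0`, and `lam = 0.3048`, the longitudinal flux spectrum vanishes for wave vectors parallel to the mean flow and is largest for wave vectors perpendicular to it, whereas the transverse flux spectra vanish in both of those limits. This direction dependence is what makes the flux covariance anisotropic even for an isotropic ln K field.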

The  unconditional  random  field  of  logarithm  of  volumetric  content  of  NAPL 
(ln#„)  is  also  assumed  to  be  second-order  stationary  and  statistically  isotropic.  The 
spatial  distribution  of  the  residual  NAPL  is  expected  to  be  related  to  the  spatial 
distribution  of  hydraulic  conductivity  due  to  the  multiphase  flow  processes  that  lead 
to  NAPL  entrapment.  The  amount  of  NAPL  residing  in  porous  formation  depends 
on  NAPL- water  hysteresis  properties.  In  a  water- wet  system,  NAPL  tends  to  occupy 
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Table  2.1:  Input  parameters  for  the  unconditional  simulations. 


Parameter 

Value 

Mean  hydraulic  conductivity  K„ 

x  v  x  vfiii    xx  y  vx  x  cx  vx  x  x  v     v>  v/xx  v_x  vx  v_/  vj  x  v  x  u  y  (1 

17.1  m/dav 

Partitioning  coefficient  i^T/v 

9.0 

Longitudinal  (r>r) 

1  J  V/ll  w  X  VJ  VX  VXX  XX  VXrX     I  V  '  /  1 

0.1  m 

Transverse  disnersivitv  (o/t) 

x.  x  xxxxl-J  v  v^  x  uv    vi  l  u      v>xkjx»xvjy     \        /  i 

0.1  m 

Initial  concentration  Ch 

1.0  e/m3 

-L.W  1(1/ 

Duration  of  tracer  input 

0.1  days 

Volumetric  water  content  f?„„ 

V  V  7  J.  IX  XIX      VJ  X  X  V>      ¥  T  IX  VJ  V  t  x     v>  vy  XX  UVli  VJ     v  *)  i  I 

0.21 

Mean  hvdraulir  gradient  .7 

X VlijUill     XXV  VXX  CX  1111  V       fS     *x VX lv-11  VJ  t/ 

0  05893 

Domain  size  lx  x  ly  x  lz 

3.556  m  X3.556  m  x  1.524  m 

Unconditional  mean  of  ri„ 

V_y  llvV/llVXl  (J  X  V/  XX  CXI    XXX  V  *  XXX X     V  /  X  VX 

1  00584  m/dav 

x  *  w  v/  w  Vw/  x    xxx  /  vx xx  y 

Unconditional  mean  of  o., 

v_y  ll\_iVyil\Xl  VJ  X  VJ*  XX  VXX    XXXV/UiXX     V  /  J.     W  7/ 

0.0 

Unconditional  mean  of  gz 

0.0 

Unconditional  mean  of  Ln6n  (y) 

-4.62 

Unconditional  variance  of  Ln6n  {(Jy2) 

0.746 

Unconditional  variance  of  LnK  (a/2) 

1.0 

In  conductivity  correlation  length  Ay- 

0.3048  m 

Correlation  length  of  Ln6n  (\y) 

0.3048  m 

Grid  size,  Ax,  Ay,  Az 

0.254  m,  .508  m,  0.218  m 

Number  of  grid,  Nx  x  Ny  x  Nz 

11  x  7  x  7 

grid  size,  Ax  x  Ay  x  Az 

0.254  x  0.508  x  0.218  m3 

large  pores,  which  typically  represents  greater  saturated  hydraulic  conductivity  zones, 
while  in  a  NAPL-wet  system,  more  NAPL  is  retained  in  small  pores  with  presumably 
lower  hydraulic  conductivity.  Furthermore,  the  effective  ability  of  a  porous  formation 
to  transmit  water  should  be  affected  by  the  presence  of  residual  NAPL.  With  some 
portion  of  pore  space  occupied  by  NAPL,  the  actual  pore  space  available  to  transmit 
water  is  reduced,  and  hence  the  hydraulic  conductivity.  The  amount  of  this  reduction 
in  hydraulic  conductivity  depends  upon  the  amount  of  NAPL  remaining  in  residual 
phase.  Quantification  of  this  complex  correlation  between  hydraulic  conductivity 
and  NAPL  residual  saturation  is  beyond  the  scope  of  this  dissertation.  Therefore, 
a simplified relationship is proposed to describe the possible statistical correlation
between these two random fields:

$$\ln\theta_n = \bar{y} + \delta y = \bar{y} + \zeta\rho f + \sqrt{1-\rho^2}\,\eta \qquad (2.58)$$

where $\zeta = \sigma_y/\sigma_f$, $\eta$ is a zero-mean random field that is not correlated with $f$, and $\rho$ is
a weighting factor ($-1.0 \le \rho \le 1.0$) which determines the degree of the correlation
between the residual NAPL and conductivity fields.

Equation (2.58) implies that the $\ln$ hydraulic conductivity is likely to be correlated
to some degree with the $\ln$ volumetric NAPL content, but such a relationship is
unlikely to be perfect, so an additional random term $\eta$ is included. The hydraulic
conductivity $K$ considered here represents an effective conductivity in the presence
of residual NAPL and is assumed to be invariant throughout the tracer test. If $\rho$ is
greater (less) than zero, equation (2.58) represents a positive (negative) correlation
between $\ln K$ and $\ln\theta_n$, which means that more NAPL is expected to occur in regions
where hydraulic conductivity is large (small). Note that when $\rho = 0$, equation (2.58)
represents uncorrelated $\ln\theta_n$ and $\ln K$ fields, and when $\rho = \pm 1$, equation (2.58)
represents perfect positive (negative) correlation between these random fields. In
Appendix A, the covariance functions $P_{q_i y}$ and $P_{yy}$ are derived based on equations (2.57)
and (2.58). These covariances remain unchanged as the unconditional concentration
moments propagate in time.
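
The role of the weighting factor in equation (2.58) can be illustrated with a small sampling experiment. The sketch below draws independent point values rather than spatially correlated fields, and it assumes that $\eta$ has the same standard deviation as $y$ (an assumption made here so that the variance of $\delta y$ stays at $\sigma_y^2$ for every $\rho$); the function names are hypothetical.

```python
import math
import random


def simulate_perturbations(rho, sigma_f=1.0, sigma_y=0.8636, n=50_000, seed=0):
    """Draw paired samples (f, dy) following the form of equation (2.58)."""
    rng = random.Random(seed)
    zeta = sigma_y / sigma_f            # zeta = sigma_y / sigma_f
    sigma_eta = sigma_y                 # ASSUMPTION: eta has the std. dev. of y
    weight = math.sqrt(1.0 - rho * rho)
    f = [rng.gauss(0.0, sigma_f) for _ in range(n)]
    eta = [rng.gauss(0.0, sigma_eta) for _ in range(n)]
    dy = [zeta * rho * fi + weight * ei for fi, ei in zip(f, eta)]
    return f, dy


def sample_std(a):
    """Population-style sample standard deviation."""
    mean = sum(a) / len(a)
    return math.sqrt(sum((v - mean) ** 2 for v in a) / len(a))


def sample_corr(a, b):
    """Sample correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n
    return cov / (sample_std(a) * sample_std(b))


# The five values of rho used in the case studies.
for rho in (0.0, 0.25, -0.25, 1.0, -1.0):
    f, dy = simulate_perturbations(rho)
    print(f"rho = {rho:+.2f}: corr(f, dy) = {sample_corr(f, dy):+.3f}, "
          f"std(dy) = {sample_std(dy):.3f}")
```

Under these assumptions the sample correlation between $f$ and $\delta y$ approaches $\rho$ and the standard deviation of $\delta y$ stays close to $\sigma_y$, which is the sense in which $\rho$ sets the degree of correlation without changing the variability of the NAPL field.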

In the next section, five synthetic examples are discussed: the uncorrelated case
($\rho = 0$), weakly (positively and negatively) correlated cases ($\rho = \pm 0.25$),
and perfectly (positively and negatively) correlated cases ($\rho = \pm 1$). For some
cases, two special situations are also examined: (1) hydraulic conductivity is the
only random field and NAPL is assumed to be spatially invariant; and (2) the residual
NAPL saturation is random and hydraulic conductivity is spatially invariant. The
parameters used in these cases are listed in Table 2.2. The statistical parameters for
the $\ln\theta_n$ field are based roughly on a statistical analysis of the variograms
of $S_n$ estimated by Annable et al. [3] using the temporal moment method on tracer
MLS breakthrough data measured in the cosolvent flushing experiment at the OU-1
test cell, Hill AFB, Utah. Information leading to the statistics of $\ln K$ is relatively
limited. The mean of $\ln K$ is selected to be the same as the value obtained during
a hydraulic test conducted at the OU-1 test cell [112], while $\sigma_f$ and $\lambda_f$ are chosen
to be comparable to $\sigma_y$ and $\lambda_y$, respectively.

Table 2.2: Input parameters for unconditional case studies.

Parameter                      Case 1    Case 2    Case 3    Case 4    Case 5
$F$                            2.839     2.839     2.839     2.839     2.839
$\sigma_f$                     1.000     1.000     1.000     1.000     1.000
$\lambda_f$, m                 0.3048    0.3048    0.3048    0.3048    0.3048
$\bar{y}$                      -4.6193   -4.6193   -4.6193   -4.6193   -4.6193
$\sigma_y$                     0.8636    0.8636    0.8636    0.8636    0.8636
$\lambda_y$, m                 0.3048    0.3048    0.3048    0.3048    0.3048
$\zeta = \sigma_y/\sigma_f$    0.8636, 0.8636, 0.8636, 0.8636
$\sigma_\eta$                  0, 0, 0.8636, 0.8636
$\lambda_\eta$, m              0.3048, 0.3048
$\rho$                         0         0.25      -0.25     1.0       -1.0

2.4.2    Unconditional  simulation  results 

Case 1 assumes that the random fields $\ln K$ and $\ln\theta_n$ are statistically
uncorrelated. Figure 2.3 shows the unconditional mean concentration $\bar{c}(\mathbf{x})$ as dashed
contour lines superimposed on a synthetically simulated single replicate concentration
field at 0.1, 0.5, 1.0, and 1.5 days for an example horizontal layer located 0.762 m above
the base of the simulation domain. The associated concentration standard deviations
are shown in Figure 2.4. Since both the Darcy flux field and the $\ln\theta_n$ field are
assumed stationary, and the input covariances $P_{q_i q_j}$, $P_{q_i y}$, and $P_{yy}$ are all symmetric
about the mean flow direction, the contour lines for the unconditional mean
concentration are also symmetric. Figure 2.3 shows that even though the
unconditional mean concentration contours do not capture the irregular pattern of the
synthetically simulated single replicate concentration field (which is tortuous due to
the heterogeneous hydraulic conductivity and NAPL residual content), the predicted
locations of the center of mass appear to be good measures of the center of mass of
the true concentration fields. Figure 2.4 shows that the region of maximum
concentration prediction uncertainty is near the center of mass of the mean concentration,
and propagates downstream with the mean plume.

Figure 2.5 plots the concentration prediction uncertainty along the control line
Y4Z4, which is the centroid line of the box. Figure 2.5 clearly indicates that the
highest concentration prediction uncertainty corresponds to the inflection points of
the mean longitudinal concentration profile, where the concentration gradients are the
highest. Since the mean longitudinal concentration profile generally has a symmetric
shape, two peaks in the prediction uncertainty are observed, one on either side of the
peak mean concentration, corresponding to the inflection points on the $\bar{c}$ profile. Similar
patterns for the concentration uncertainty of a nonreactive solute in two dimensions
were found in the numerical investigations performed by Graham and McLaughlin
[61] and Rubin [102, 103], and for a reactive solute in three dimensions by Burr et al. [14].
The magnitude of the concentration prediction uncertainty decreases with time and
eventually diminishes to zero as most of the solute is transported out of the domain.
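
The link between the two uncertainty peaks and the inflection points of a symmetric profile can be illustrated numerically. The sketch below uses a hypothetical Gaussian-shaped mean profile (the center, spread, and peak values are arbitrary illustrative choices, not results from this chapter) and locates the steepest point on each side of the peak by central differences.

```python
import math


def gaussian_profile(x, center=1.5, spread=0.4, peak=1.0):
    """HYPOTHETICAL symmetric mean concentration profile along the flow direction."""
    return peak * math.exp(-0.5 * ((x - center) / spread) ** 2)


def steepest_points(center=1.5, spread=0.4, n=3001, length=3.0):
    """Locate the steepest point on each side of the peak by central differences."""
    dx = length / (n - 1)
    xs = [i * dx for i in range(n)]
    c = [gaussian_profile(x, center, spread) for x in xs]
    # Magnitude of the concentration gradient at the interior nodes.
    grad = [abs(c[i + 1] - c[i - 1]) / (2.0 * dx) for i in range(1, n - 1)]
    inner = xs[1:-1]
    left = max((g, x) for g, x in zip(grad, inner) if x < center)[1]
    right = max((g, x) for g, x in zip(grad, inner) if x > center)[1]
    return left, right


left, right = steepest_points()
print(f"steepest gradients near x = {left:.2f} m and x = {right:.2f} m")
```

For a Gaussian shape the steepest points sit one spread length either side of the peak, which is where a gradient-driven uncertainty would be expected to show its two maxima.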

Since the mean concentration equations ((2.20) through (2.25)) are solved at
first order, the temporal derivative term $-K_N e^{\bar{y}}\,\frac{\partial}{\partial t}[P_{yc}(\mathbf{x},\mathbf{x},t)]$ and the spatial
derivative term $-\frac{\partial}{\partial x_i}[P_{q_i c}(\mathbf{x},\mathbf{x},t)]$ do not come into the solution of $\bar{c}(\mathbf{x},t)$. However,
it is interesting to evaluate these terms as they reflect the impact of heterogeneities
of NAPL and conductivity, respectively, on the spreading of the mean plume.
Figure 2.6 shows $K_N e^{\bar{y}}\,\frac{\partial}{\partial t}[P_{yc}(\mathbf{x},\mathbf{x},t)]$ against time at 4 points along the longitudinal
center control line (Y4Z4), located at 0.635, 1.397, 2.159, and 2.921 m from the
inflow boundary. Figure 2.7 plots the longitudinal derivative of $P_{q_1 c}(\mathbf{x},\mathbf{x},t)$ against
$x$ along the control line Y4Z4 at times 0.2, 0.5, 1.0, 1.5, and 2.0 days. Notice that
the magnitude of $K_N e^{\bar{y}}\,\frac{\partial}{\partial t}[P_{yc}(\mathbf{x},\mathbf{x},t)]$ varies significantly along the longitudinal
direction, suggesting that the presence of NAPL heterogeneity has a direct impact on
the longitudinal macrodispersion of the solute, as does the heterogeneity of hydraulic
conductivity. For uncorrelated NAPL and conductivity fields, the presence of residual
NAPL heterogeneity is likely to enhance the longitudinal macrodispersion because the
two terms have largely the same sign along the longitudinal direction at any given time. It is
also found that $K_N e^{\bar{y}}\,\frac{\partial}{\partial t}[P_{yc}(\mathbf{x},\mathbf{x},t)]$ does not vary much transversely (not shown
here), suggesting that NAPL heterogeneity has very limited effect on the transverse
macrodispersion of a reactive solute.

Figure 2.3: Unconditional mean concentration is shown as dashed contour lines
superimposed on the synthetically simulated single replicate concentration field shown
as solid contour lines. Shown above is horizontal layer Z4 (0.762 m above the base)
at time 0.1, 0.5, 1.0, and 1.5 days. Case No.1 ($\rho = 0$)

Figure 2.4: Unconditional concentration standard deviation at horizontal layer Z4
(0.762 m above the base) at time 0.1, 0.5, 1.0, and 1.5 days. Case No.1 ($\rho = 0$)

Figure 2.5: Plot of unconditional concentration standard deviation versus $x$ along
control line Y4Z4 at time 0.2, 0.5, 1.0, 1.5, and 2.0 days. Case No.1 ($\rho = 0$)


The above findings appear to be consistent with Gelhar's analytical study [57]
of a reactive sorbing solute subject to heterogeneity in the partitioning coefficient.
Gelhar [57] suggested that uncorrelated variation of the partitioning coefficient
(relative to hydraulic conductivity) always leads to an additive effect on the longitudinal
macrodispersion. While sorption variability can either increase or decrease
the longitudinal macrodispersion depending on the sign of the correlation between
sorption variability and conductivity, it does not significantly affect solute spreading in
the transverse directions (see Gelhar [57], Bellin et al. [10], and Burr et
al. [14]). Note that $\frac{\partial}{\partial x_1} P_{q_1 c}(\mathbf{x},\mathbf{x},t)$ is generally an order of magnitude greater than
$K_N e^{\bar{y}}\,\frac{\partial}{\partial t}[P_{yc}(\mathbf{x},\mathbf{x},t)]$, suggesting that the randomness of hydraulic conductivity $K$
is the primary reason for the macroscopic spreading of the ensemble mean reactive
solute, while the randomness of the NAPL field is of secondary importance.

Figure 2.7: Plot of $\frac{\partial}{\partial x_1}[P_{q_1 c}(\mathbf{x},\mathbf{x},t)]$ versus $x$ along control line Y4Z4 at time 0.2, 0.5,
1.0, 1.5, and 2.0 days. Case No.1 ($\rho = 0$)

Since the ultimate goal of this dissertation is to estimate the spatial distribution
of aquifer hydrogeochemical parameters from the partitioning tracer concentration
measurements, it is interesting to look at the normalized concentration
cross-correlations between different locations, i.e., $\mathbf{x}$ and $\mathbf{x}'$. These correlations
provide important information on how available concentration data can be used
to optimally estimate other hydrogeochemical parameters. Figure 2.8 shows the
unconditional cross-correlation $\rho_{cy}(\mathbf{x},\mathbf{x}') = P_{cy}(\mathbf{x},\mathbf{x}')/[\sigma_c(\mathbf{x})\sigma_y(\mathbf{x}')]$ (solid contour
lines) with respect to a reference location $\mathbf{x}$ (2.159 m, 1.778 m, 0.762 m) on the
horizontal layer Z4 at time 0.1, 0.2, 0.5, 1.0, 1.5, and 2.0 days. Also plotted in
Figure 2.8 are $\rho_{yy}(\mathbf{x},\mathbf{x}') = P_{yy}(\mathbf{x},\mathbf{x}')/[\sigma_y(\mathbf{x})\sigma_y(\mathbf{x}')]$ (the colored contour lines) and
$\rho_{cc}(\mathbf{x},\mathbf{x}') = P_{cc}(\mathbf{x},\mathbf{x}')/[\sigma_c(\mathbf{x})\sigma_c(\mathbf{x}')]$ (the dashed contour lines). The unconditional
contours for $\rho_{yy}(\mathbf{x},\mathbf{x}')$ remain unchanged over time and cover only a small
region. This means that if one measures the volumetric NAPL content ($\theta_n$) directly
at a point, one will not obtain much information about NAPL content at
locations further than one correlation length ($\lambda_y$) away from the measurement point.
However, the contours for both $\rho_{cc}(\mathbf{x},\mathbf{x}')$ and $\rho_{cy}(\mathbf{x},\mathbf{x}')$ extend over a much broader
region both longitudinally and transversely, suggesting that measuring tracer
concentration at $\mathbf{x}$ provides a greater amount of information on NAPL content upstream
than the direct measurement of $\theta_n$.
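
The normalization that turns a cross-covariance into the plotted cross-correlation can be sketched as follows. The function names and all numerical values are hypothetical and serve only to show the arithmetic; returning zero where a variance vanishes (for example, where no tracer has arrived yet) is a design choice of this sketch, not something stated in the text.

```python
import math


def normalized_correlation(cov_ab, var_a, var_b):
    """rho_ab = P_ab / (sigma_a * sigma_b); returns 0.0 where a variance vanishes."""
    denom = math.sqrt(var_a * var_b)
    return cov_ab / denom if denom > 0.0 else 0.0


def correlation_map(cov_cy, var_c, var_y):
    """Normalize P_cy(x, x') for a fixed measurement point x over all x'."""
    return [normalized_correlation(p, var_c, v) for p, v in zip(cov_cy, var_y)]


# HYPOTHETICAL numbers for illustration only (not results from this chapter).
cov_cy = [-0.010, -0.030, -0.005, 0.0]   # P_cy(x, x') at four locations x'
var_c = 0.01                             # P_cc(x, x) at the measurement point
var_y = [0.746] * 4                      # P_yy(x', x') = sigma_y^2 (stationary field)

print([round(r, 3) for r in correlation_map(cov_cy, var_c, var_y)])
```

Because the NAPL field is stationary, the denominator changes over time only through the concentration variance at the measurement point, so the shape of each contour map is set by the cross-covariance itself.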

Figure 2.8 indicates that if the concentration measured at a location downstream
of the center of mass, for example at $x = 2.159$ m at 0.5 days, is higher (lower)
than the mean concentration $\bar{c}$, then the solute is less (more) retarded than expected
and is likely to have swept a region with NAPL content lower (higher) than the
mean $\ln$ NAPL content $\overline{\ln\theta_n}$; hence the correlation $\rho_{cy}(\mathbf{x},\mathbf{x}')$ is negative. After the
center of mass travels out of the downstream boundary, for instance at time 1.5 days,
a higher (lower) concentration than $\bar{c}$ measured at the same location would indicate that the
solute is more (less) retarded and has likely swept a region with NAPL content
higher (lower) than the mean $\ln$ NAPL content $\overline{\ln\theta_n}$; hence the correlation $\rho_{cy}(\mathbf{x},\mathbf{x}')$
becomes positive. In other words, the sign of $\rho_{cy}(\mathbf{x},\mathbf{x}')$ changes once the center of mass
passes by $\mathbf{x}$, while the magnitude of $\rho_{cy}(\mathbf{x},\mathbf{x}')$ remains almost invariant over time.

Similar results can also be seen from the contour plot of $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ in Figure
2.9. In contrast to $\rho_{cy}(\mathbf{x},\mathbf{x}')$, $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ is positive while the center of mass is
upstream of the measurement location $\mathbf{x}$, and then changes to negative after the center
of mass passes by $\mathbf{x}$. This is reasonable because a higher (lower) hydraulic conductivity
zone results in a higher (lower) Darcy flux and thus a smaller (larger) solute travel time,
or a higher (lower) than expected solute concentration. These plots suggest that the
information from local concentration measurements regarding the NAPL or Darcy flux
distribution is essentially constant over time. This implies that at a particular
location, a single concentration measurement at one time may provide all the information
available to estimate NAPL residual content as well as the Darcy flux for the region
swept by the tracer, and repeated measurements at that location may provide limited
additional information. The figures also indicate that the highest correlations are not at
$\mathbf{x}$, but rather at areas immediately upstream from $\mathbf{x}$, and that little information can
be gained further downstream than one correlation length $\lambda_y$ away from the measurement
location $\mathbf{x}$.

The above results have an important implication for the design of field sampling
networks. For tracer experiments conducted under field conditions, these results
suggest that the spatial density of the sampling network may be more important than the
temporal sampling frequency. A snap-shot sampling scenario would appear to be
more informative in site characterization than frequent sampling at a few locations.
The magnitude of $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ is much greater than that of $\rho_{cy}(\mathbf{x},\mathbf{x}')$, suggesting that
the flux perturbation is more strongly correlated with the concentration perturbation than the
$\ln\theta_n$ perturbation is. Therefore, inverse interpretation of concentration measurements
should lead to more confident estimation of Darcy flux than estimation of $\ln\theta_n$.
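
One way to see why a larger correlation magnitude implies a more confident estimate is the standard result for a linear minimum-variance update with a single observation, in which the variance remaining after conditioning is the prior variance times $(1 - \rho^2)$. The sketch below assumes an error-free measurement and a linear relationship; the function name and the two correlation values are hypothetical and are not taken from the figures.

```python
def conditional_variance(prior_var, rho):
    """Variance left after conditioning on one error-free, linearly related
    observation whose correlation with the parameter is rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return prior_var * (1.0 - rho * rho)


# HYPOTHETICAL correlation magnitudes, chosen only to illustrate the contrast.
rho_cq1 = 0.6   # concentration vs. Darcy flux
rho_cy = 0.2    # concentration vs. ln(theta_n)

for name, rho in (("q1", rho_cq1), ("ln(theta_n)", rho_cy)):
    removed = 1.0 - conditional_variance(1.0, rho)
    print(f"{name}: {100.0 * removed:.0f}% of the prior variance removed")
```

The sign of the correlation does not matter for the variance reduction, only its magnitude, which is consistent with the observation that the information content stays nearly constant even as the sign flips when the plume passes.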
Figure 2.10 shows the unconditional mean concentration $\bar{c}(\mathbf{x},t)$ for two cases
with weakly correlated $y$ and $f$, i.e., $\rho = \pm 0.25$. Neither the mean concentration
contours nor the synthetically simulated concentration fields are very different despite
the difference in the sign of the correlation of the NAPL field. This again suggests
that the random distribution of the hydraulic conductivity has the major impact on
the concentration distribution.

In comparison to Figure 2.8 for case 1 ($\rho = 0$), Figures 2.11 to 2.14 show
the normalized cross-correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$, $\rho_{cy}(\mathbf{x},\mathbf{x}')$, and $\rho_{cc}(\mathbf{x},\mathbf{x}')$ at horizontal
layer Z4 for case 2 ($\rho = 0.25$), case 3 ($\rho = -0.25$), case 4 ($\rho = 1.0$), and case 5
($\rho = -1.0$), respectively. Again, for each case notice the change of sign of $\rho_{cy}(\mathbf{x},\mathbf{x}')$
after the center of mass passes the measurement location $\mathbf{x}$, while the magnitude of
$\rho_{cy}(\mathbf{x},\mathbf{x}')$ remains essentially unchanged over time. The magnitude of $\rho_{cy}(\mathbf{x},\mathbf{x}')$ for the
perfectly correlated cases ($\rho = \pm 1$) is the highest, while for the weakly positively
correlated case ($\rho = 0.25$) the magnitude of $\rho_{cy}(\mathbf{x},\mathbf{x}')$ is the smallest. For the case where
the random fields of $\ln K$ and $\ln\theta_n$ are perfectly negatively correlated ($\rho = -1.0$),
the higher the hydraulic conductivity in a region, the lower the NAPL content, and hence
the faster the solute travels. A stronger correlation therefore results between the
concentration $c$ measured at location $\mathbf{x}$ and $\ln\theta_n$ at location $\mathbf{x}'$. It is worth noting
that for the perfectly positively correlated case ($\rho = 1.0$), the correlation $\rho_{cy}(\mathbf{x},\mathbf{x}')$ is
initially positive (rather than negative, as in the other cases), and then turns negative
as the center of mass passes $\mathbf{x}$. This is because at early times, when the center of the plume
is upstream of $\mathbf{x}$, if the measured concentration $c(\mathbf{x})$ is higher (lower)
than its mean $\bar{c}(\mathbf{x})$, we can expect the tracer-swept volume to have a higher (lower)
hydraulic conductivity. Since hydraulic conductivity and NAPL content have been
assumed to be perfectly positively correlated, i.e., high (low) hydraulic conductivity
implies high (low) NAPL content, a higher concentration also implies higher
NAPL content.

Figure 2.15 compares the concentration standard deviation for all five cases.
The longitudinal profile of the concentration standard deviation $\sqrt{P_{cc}(\mathbf{x},\mathbf{x})}$ is shown
along the control line Y4Z4 at time 0.5 days. This figure shows that cases with
negatively correlated random $\ln K$ and $\ln\theta_n$ fields generally have higher concentration
prediction uncertainty. The concentration uncertainty is the highest for the perfectly
negatively correlated case, and the lowest for the perfectly positively correlated case.

Figure 2.16 summarizes the longitudinal macrodispersive flux $P_{q_1 c}(\mathbf{x},\mathbf{x})$ for
the five cases. The longitudinal macrodispersive flux is generally the greatest for
case 5 ($\rho = -1$), where $\ln K$ and $\ln\theta_n$ are perfectly negatively correlated. Thus, as
discussed earlier, negative correlation between the $\ln K$ and $\ln\theta_n$ fields generally leads
to enhanced macrodispersion compared to positive correlation. In such cases,
high conductivity regions also imply low NAPL residual content, i.e., a low retardation
effect, and thus the solute travels through these regions at much higher velocity
than through regions with low conductivity and high NAPL content. The overall
effect is that the mean plume is increasingly stretched out, and therefore enhanced
macrodispersion is observed. A similar comparison of $P_{yc}(\mathbf{x},\mathbf{x})$ is shown in Figure 2.17.
Again, the perfectly negatively correlated case shows the strongest impact of the
randomness in $\ln\theta_n$ on the ensemble mean solute dispersion at the macroscale.
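
The stretching argument above can be illustrated with a simple point-wise Monte Carlo experiment on the retarded velocity $v = q/(\theta_w + K_N\theta_n)$. This is only a sketch under strong assumptions that are not part of the original analysis: the flux perturbation is taken equal to the local $\ln K$ perturbation $f$ (flow continuity is ignored), $\eta$ is given the same standard deviation as $y$, and the partitioning coefficient `K_N = 100` is a hypothetical value.

```python
import math
import random

THETA_W = 0.21                      # volumetric water content (parameter table)
Y_MEAN, SIGMA_Y = -4.6193, 0.8636   # statistics of ln(theta_n) (Table 2.2)
SIGMA_F = 1.0                       # standard deviation of ln K (Table 2.2)
K_N = 100.0                         # HYPOTHETICAL partitioning coefficient


def log_velocity_spread(rho, n=50_000, seed=1):
    """Sample std. dev. of ln(v), with v = q / (theta_w + K_N * theta_n).

    ASSUMPTIONS: the ln-flux perturbation equals the local ln K perturbation f
    (flow continuity is ignored) and eta has the same std. dev. as y.
    """
    rng = random.Random(seed)
    zeta = SIGMA_Y / SIGMA_F
    weight = math.sqrt(1.0 - rho * rho)
    samples = []
    for _ in range(n):
        f = rng.gauss(0.0, SIGMA_F)
        eta = rng.gauss(0.0, SIGMA_Y)
        theta_n = math.exp(Y_MEAN + zeta * rho * f + weight * eta)  # equation (2.58)
        samples.append(f - math.log(THETA_W + K_N * theta_n))       # ln v up to a constant
    mean = sum(samples) / n
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / n)


for rho in (-1.0, -0.25, 0.0, 0.25, 1.0):
    print(f"rho = {rho:+.2f}: std of ln v = {log_velocity_spread(rho):.3f}")
```

In this simplified setting the spread of the retarded velocity grows as $\rho$ becomes more negative, because fast-flowing zones are then also the least retarded ones, which mirrors the qualitative trend described for the macrodispersive flux.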

Figure 2.18 plots the actual values of $\rho_{cy}(\mathbf{x},\mathbf{x}')$ and $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ versus $x$ along
the center line Y4Z4 for reference point $\mathbf{x}'$ (2.159 m, 1.778 m, 0.762 m) at time
0.5 days. The magnitude of $\rho_{cy}(\mathbf{x},\mathbf{x}')$ appears to be much more
sensitive to the value of $\rho$ (see Figure 2.18(A)) than that of $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$.
Since the correlation between concentration and $\ln\theta_n$ (or $\ln K$) at two different points
indicates how far the information from the discrete concentration observations can
be extrapolated over space to estimate hydrogeochemical parameters, it follows that
while the NAPL-conductivity correlation plays a crucial role in spatial estimation of
NAPL, flux estimation seems to be relatively insensitive to this condition. Both $\rho_{cy}$
and $\rho_{cq_1}$ drop off significantly to near zero immediately downstream of $\mathbf{x}$, indicating
that parameter estimation will likely be associated with great uncertainty downstream
of the observation locations.

Finally, we look at a special situation with spatially invariant hydraulic
conductivity, i.e., NAPL is the only source of concentration perturbations. Figure 2.19
shows the normalized correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$, $\rho_{cy}(\mathbf{x},\mathbf{x}')$, and $\rho_{cc}(\mathbf{x},\mathbf{x}')$ with respect
to the same reference point $\mathbf{x}$ on the horizontal layer Z4 at time 0.2, 0.5, 1.0, and 1.5
days. The magnitude of $\rho_{cy}(\mathbf{x},\mathbf{x}')$ is greater for this case than for case 1, where
uncertainty in both $\ln K$ and $\ln\theta_n$ exists. This suggests that for more uniform aquifers, it is
reasonable to expect higher confidence in estimating the residual NAPL content from
partitioning tracer concentration measurements. Also notice that the magnitude of $\rho_{cy}(\mathbf{x},\mathbf{x}')$
is similar to the case when $\rho = -1$ (Figure 2.14). This is because information
contained in $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ contributes to information in $\rho_{cy}(\mathbf{x},\mathbf{x}')$ when $\ln K$ and $\ln\theta_n$ are
strongly correlated.

2.5  Summary 

The  objective  of  this  chapter  is  to  gain  a  better  understanding  of  reactive  solute 
migration  in  hydrogeochemically  heterogeneous  formations.  Particular  attention  has 
been  placed  on  the  statistical  correlation  between  the  concentration  observations  and 
the  aquifer  hydrogeochemical  parameters,  since  the  ultimate  aim  is  to  obtain  an 
optimal  estimate  of  the  spatial  distribution  of  these  parameters  by  incorporating  the 
unconditional  moments  into  a  moment  conditioning  algorithm. 

The unconditional simulations performed here indicate that Darcy flux
heterogeneity, induced directly by the hydraulic conductivity heterogeneity, is the
primary contributor to the spatial variation and macroscopic spreading of the
concentration field. While the presence of NAPL heterogeneity does not affect the transverse
macrodispersion, it was found to either increase or decrease longitudinal
macrodispersion depending on the sign of correlation between NAPL and conductivity fields.
Contribution of NAPL variability to the longitudinal macrodispersion of the mean
solute plume was found to be less dominant than that of conductivity variability. The
dependence of the concentration moments on the statistical moments between
conductivity and NAPL was also investigated. Negative conductivity-NAPL correlations
generally led to higher concentration prediction uncertainty and higher macroscopic
spreading than positive correlations.

Analyses of the cross-correlations $\rho_{cy}(\mathbf{x},\mathbf{x}')$ and $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ reveal valuable
information for sampling network design in site characterization using tracers. The fact
that the magnitudes of $\rho_{cy}(\mathbf{x},\mathbf{x}')$ and $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ remain unchanged with time suggests
that, at a particular location, measuring concentration once should provide most of the
information available on the NAPL and conductivity distribution in the upstream aquifer
volume swept by the tracer. Frequent measurement at the same location does not
appear to significantly increase the amount of information. Hence a snap-shot sampling
scheme should be most effective in reducing estimation uncertainty. Estimation of
flux appears to be insensitive to the sign and degree of conductivity-NAPL
correlation. However, NAPL estimation from concentration observations depends strongly
on the conductivity-NAPL correlation, with more information gained as $\rho$ decreases from
0.25 (weak positive correlation) to -1 (perfect negative correlation). Weak positive
correlations appear to lead to indistinguishable signals in the concentration
measurements. In general, a high concentration observation before the arrival of the center
of mass can indicate either high conductivity or low NAPL. For the positively
correlated case, however, high conductivity also implies high NAPL, so less deviation from
the mean behavior will occur. The magnitude of $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ is generally greater than that of
$\rho_{cy}(\mathbf{x},\mathbf{x}')$; thus it is reasonable to expect the flux distribution to be estimated more
accurately than the NAPL distribution. In hydrogeologically uniform porous
formations, the absence of conductivity variation causes an increase in NAPL-concentration
correlation and thus reduced NAPL prediction uncertainty.

Figure 2.8: Unconditional correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$, $\rho_{cy}(\mathbf{x},\mathbf{x}')$, and $\rho_{cc}(\mathbf{x},\mathbf{x}')$ with respect
to reference point $\mathbf{x}$ (2.159 m, 1.778 m, 0.762 m) are shown as solid color contour
lines, solid contour lines, and dashed contour lines. Shown above is horizontal layer
Z4 (0.762 m above the base) at time 0.1, 0.2, 0.5, 1.0, 1.5, and 2.0 days. Case No.1 ($\rho = 0$)


Color scale (0.0 to 1.0): unconditional normalized $P_{yy}(\mathbf{x},\mathbf{x}')$, $\mathbf{x} = (9,4,4)$.

Figure 2.9: Unconditional correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$ and $\rho_{cq_1}(\mathbf{x},\mathbf{x}')$ with respect to
reference point $\mathbf{x}$ (2.159 m, 1.778 m, 0.762 m) are shown as solid color contour lines and
solid contour lines. Shown above is horizontal layer Z4 (0.762 m above the base) at
time 0.1, 0.5, 1.0, and 1.5 days. Case No.1 ($\rho = 0$)

Figure 2.10: Unconditional mean concentration is shown as solid contour lines
superimposed on the synthetically simulated concentration field shown as dashed contour
lines. Shown above is horizontal layer Z4 (0.762 m above the base) at time 0.5, 1.0,
and 1.5 days. Case No.2 ($\rho = 0.25$) and Case No.3 ($\rho = -0.25$)

Figure 2.11: Unconditional correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$, $\rho_{cy}(\mathbf{x},\mathbf{x}')$, and $\rho_{cc}(\mathbf{x},\mathbf{x}')$ with
respect to reference point $\mathbf{x}$ (2.159 m, 1.778 m, 0.762 m) are shown as solid color contour
lines, solid contour lines, and dashed contour lines. Shown above is horizontal layer
Z4 (0.762 m above the base) at time 0.2, 0.5, 1.0, and 1.5 days. Case No.2 ($\rho = 0.25$)

Figure 2.12: Unconditional correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$, $\rho_{cy}(\mathbf{x},\mathbf{x}')$, and $\rho_{cc}(\mathbf{x},\mathbf{x}')$ with
respect to reference point $\mathbf{x}$ (2.159 m, 1.778 m, 0.762 m) are shown as solid color contour
lines, solid contour lines, and dashed contour lines. Shown above is horizontal layer
Z4 (0.762 m above the base) at time 0.2, 0.5, 1.0, and 1.5 days. Case No.3 ($\rho = -0.25$)

Figure 2.13: Unconditional correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$, $\rho_{cy}(\mathbf{x},\mathbf{x}')$, and $\rho_{cc}(\mathbf{x},\mathbf{x}')$ with
respect to reference point $\mathbf{x}$ (2.159 m, 1.778 m, 0.762 m) are shown as solid color contour
lines, solid contour lines, and dashed contour lines. Shown above is horizontal layer
Z4 (0.762 m above the base) at time 0.2, 0.5, 1.0, and 1.5 days. Case No.4 ($\rho = 1$)

Figure 2.14: Unconditional correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$, $\rho_{cy}(\mathbf{x},\mathbf{x}')$, and $\rho_{cc}(\mathbf{x},\mathbf{x}')$ with
respect to reference point $\mathbf{x}$ (2.159 m, 1.778 m, 0.762 m) are shown as solid color contour
lines, solid contour lines, and dashed contour lines. Shown above is horizontal layer
Z4 (0.762 m above the base) at time 0.2, 0.5, 1.0, and 1.5 days. Case No.5 ($\rho = -1$)

Figure 2.15: Comparison of unconditional concentration standard deviation
along control line Y4Z4 at time 0.5 days.

Figure 2.16: Comparison of unconditional longitudinal macrodispersive flux $P_{q_1 c}(\mathbf{x},\mathbf{x})$
versus $x$ along control line Y4Z4 at time 0.5 days.

Figure 2.17: Comparison of unconditional $P_{yc}(\mathbf{x},\mathbf{x})$ at (1.397 m, 1.778 m, 0.762 m)
versus time.

Figure 2.18: [A] Comparison of unconditional $\rho_{yc}(\mathbf{x},\mathbf{x}')$ with respect to reference
point $\mathbf{x}'$ (2.159 m, 1.778 m, 0.762 m) versus $x$ along control line Y4Z4 at time 0.5
days. [B] Comparison of unconditional $\rho_{q_1 c}(\mathbf{x},\mathbf{x}')$ with respect to reference point $\mathbf{x}$
(2.159 m, 1.778 m, 0.762 m) versus $x$ along control line Y4Z4 at time 0.5 days.

Figure 2.19: Unconditional correlations $\rho_{yy}(\mathbf{x},\mathbf{x}')$, $\rho_{cy}(\mathbf{x},\mathbf{x}')$, and $\rho_{cc}(\mathbf{x},\mathbf{x}')$ with
respect to reference point $\mathbf{x}$ (2.159 m, 1.778 m, 0.762 m) are shown as solid color contour
lines, solid contour lines, and dashed contour lines. Shown above is horizontal layer Z4
(0.762 m above the base) at time 0.2, 0.5, 1.0, and 1.5 days. Hydraulic conductivity
is spatially invariant.


CHAPTER  3 
INFERENCE  OF  CONDITIONAL  MOMENTS 

3.1    Introduction  and  Literature  Review 

Accurate description of transport of chemicals in hydrogeochemically
heterogeneous porous media is needed in environmental applications of groundwater
modeling for site characterization and remediation of aquifers contaminated by residual
nonaqueous phase liquids (NAPLs). Numerical simulation of groundwater flow and
transport problems relies on knowledge of the spatial variation of the aquifer
system parameters, inputs, and boundary conditions. Given a physical system, through
fundamental "laws" and empirical testing, one can employ deterministic models to
establish the interrelationships among certain variables of interest, inputs to the
system, and output from the system. In order to observe the actual system behavior,
measurement devices are constructed to output data signals which are the only
information that is directly discernible about the system behavior.

A deterministic model does not provide a totally sufficient means of describing
the actual system for two reasons. First, no mathematically based deterministic
model is perfect. Since the objective of such models is to represent the dominant
modes of the system, many effects are knowingly left unmodeled or are modeled via
mathematical approximations. In addition, various parameters within the system
structure are often not known exactly, which is one of many sources of uncertainty
in deterministic models. Second, measurement devices do not provide perfect and
complete data about a system. Even the most precise devices are corrupted by noise
to a certain degree. Therefore, parameter estimation or inverse problems are
likely to arise whenever mathematical models are used to depict a specific
real-world system.

The inverse problem in groundwater modeling is defined as estimation of the
spatially varying aquifer hydrogeochemical parameters for mapping and/or
subsequent use in forward flow and transport models. In NAPL-contaminated
aquifers, the parameters of interest generally include (1) hydraulic conductivity,
(2) NAPL residual saturation, (3) Darcy flux, and (4) distribution coefficients for
sorbing solutes. These parameters are considered as independent variables in
traditional forward groundwater models. Inverse algorithms are formulated so that
the independent parameters and their spatial structures can be identified from a
knowledge of dependent variables such as concentration, hydraulic conductivity,
and/or hydraulic head at a limited number of spatial locations.

Treating groundwater flow and transport problems in a probabilistic framework
has been widely accepted as a successful methodology for dealing with aquifer
heterogeneities. To describe a dynamic, distributed system, as is often the case
in subsurface flow and transport problems, a stochastic approach presents a
practical alternative to deterministic approaches. The motivation for using
stochastic models as a substitute for deterministic models includes the facts
that they [82]

•  account  for  the  uncertainties  in  a  direct,  quantitative,  hence  practical  fashion; 

•  provide a basis to optimally estimate the quantities of interest, given the
incomplete, noise-corrupted measurement data;

•  provide  a  means  to  evaluate  the  performance  of  the  estimation  system. 

Stochastic  methods  view  the  parameters  of  interest  as  random  fields  that  can  be 
described  in  terms  of  a  few  fundamental  statistical  variables  such  as  the  mean  and 
covariance.  The  ensemble  of  formations  on  which  statistical  calculations  are  carried 
out  represents  all  aquifers  with  the  same  multivariable  probability  density  functions. 
One of the most important tools provided by statistical theory is aimed at
reducing prediction uncertainty. Stochastic modeling is most useful if a
particular realization, reflecting a specific field aquifer, can be singled out
from the modeled ensemble or can be represented accurately by the ensemble
moments. Many of the early stochastic analyses assumed, explicitly or implicitly,
the ergodic hypothesis, which states that the ensemble averages are equivalent to
the spatial averages of a single realization. The ergodic assumption allows the
derivation of macroscopic effective parameters which may be used directly in the
physically based flow and transport equations in order to simulate field
conditions. However, the ergodic assumption is applicable only when the scale of
the system is large compared with the largest correlation scale of aquifer
hydrogeochemical heterogeneity. Dagan [35, 36] pointed out that the requirement
of the ergodic assumption in unconditional stochastic theories also imposes
certain restrictions on initial plume size, shape, and orientation. An effective
advection-dispersion equation containing an effective macrodispersive coefficient
can be used to model the ensemble-averaged concentration field. However, it does
not reproduce the irregularities existing in particular concentration
distributions, and hence may not be pertinent in real situations [75, 76]. To
render stochastic transport models applicable to a wide range of field
conditions, it is important to eliminate the requirement of stationarity, and to
mitigate the consequences of nonergodicity, by conditioning the models on site
measurements [134]. Without the incorporation of site-specific information,
general stochastic methods can serve predictive purposes only under the rarely
met condition of "ergodicity" [102].

One  method  to  make  the  ensemble  statistics  site-specific  is  to  utilize  available 
measurement  data.  Statistics  of  random  variables  are  referred  to  as  unconditional  if 
they  do  not  depend  on  the  values  of  measurements  taken  from  a  particular  realization 
of  the  random  field,  or  as  conditional  if  they  depend  on  realization-specific  measured 
values. The conditioning process takes advantage of the probabilistic relationship
between correlated random fields and conditions the unconditional ensemble moments
based  upon  the  available  measurements.  Therefore,  the  resulting  conditional  mean 
represents  the  best  estimate  of  the  random  variable  given  the  available  measurements, 
while  the  conditional  variance  provides  a  measure  of  the  uncertainty  of  the  estimate. 
It can be shown that the conditional variance is always less than or equal to the
unconditional variance, and that it is equal to zero (or to the square of the
presumed measurement error) at measurement locations for the random field being
measured [89].
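
The variance-reduction property can be illustrated with the simplest possible
case. The sketch below is not taken from the cited works: it assumes two jointly
Gaussian scalars, an unmeasured parameter Y and a measured concentration C, and
applies the standard Gaussian conditioning formulas. The function name and all
numerical values are hypothetical.

```python
def condition_on_measurement(mean_y, var_y, mean_c, var_c, cov_yc, c_obs, meas_var=0.0):
    """Condition a Gaussian variable Y on one noisy measurement of a correlated variable C.

    Returns the conditional mean and conditional variance of Y.
    """
    gain = cov_yc / (var_c + meas_var)          # weight given to the measurement residual
    cond_mean = mean_y + gain * (c_obs - mean_c)
    cond_var = var_y - gain * cov_yc            # never larger than var_y
    return cond_mean, cond_var


# Hypothetical unconditional moments (illustration only).
prior_mean_y, prior_var_y = -3.0, 0.5
mean_c, var_c, cov_yc = 0.4, 0.04, -0.1

# Conditioning the unmeasured variable Y on a noisy observation of C.
cond_mean_y, cond_var_y = condition_on_measurement(
    prior_mean_y, prior_var_y, mean_c, var_c, cov_yc, c_obs=0.5, meas_var=0.01
)

# Conditioning the measured variable on its own error-free measurement:
# the covariance of C with itself is var_c, and the conditional variance is zero.
_, cond_var_c = condition_on_measurement(mean_c, var_c, mean_c, var_c, var_c, c_obs=0.5)
```

In this toy case the variance of Y drops from 0.5 to 0.3, while the measured
variable itself is known exactly once an error-free observation is used.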

Groundwater inverse modeling has been under intensive study for the past
two decades. Some of the significant works have been summarized by Dagan [30],
Hoeksema and Kitanidis [68], Yeh [133], Carrera [15], and Sun [122]. In a recent
review paper, McLaughlin and Townley [85] used functional analysis to assess
most of the algorithms applied to groundwater inverse problems. They pointed out
that, despite the adoption of a blocked description of spatial variability in
early inverse models (e.g., [18, 19]), most groundwater inverse algorithms adopt a
geostatistical viewpoint in combination with a stochastic treatment of the system,
which regards the properties of interest as stationary or nonstationary random
fields.

Kitanidis and Vomvoris [74] developed a geostatistical approach for solving the
inverse problem in which the log transmissivity was treated as a random field. A
few statistical parameters defining the mean and covariance function of the
random field were determined by maximum likelihood estimation [18, 19, 67].
Hoeksema and Kitanidis [67] applied this approach in their linear, geostatistically
oriented steady-state inverse algorithm to estimate transmissivity from head and
transmissivity measurements. The method was demonstrated using both a
synthetically generated aquifer and field data from the Jordan aquifer in Iowa.
The Hoeksema and Kitanidis approach to the steady-state inverse flow problem was
extended to the dynamic case by Sun and Yeh [124], who estimated hydraulic
conductivity from transient head observations. Through a synthetic example they
found that the simultaneous use of head measurements from all measurement times
produced much better estimates of hydraulic conductivity than the sequential
quasi-steady approach developed by Dagan and Rubin [41].

Dagan [30] used the Gaussian conditional mean technique in a simplified maximum
a posteriori procedure to estimate transmissivity from measurements of head and
log transmissivity at a few points. For a two-dimensional steady-state flow
problem, he derived analytical expressions for the covariances and
cross-covariances of the transmissivity and head fields using a Green's function
technique. Dagan's work indicated that combining head and transmissivity
measurements can lead to a substantial reduction in the prediction uncertainty of
the transmissivity field compared with conditioning on transmissivity
measurements alone. Dagan's linear estimation approach was later extended by
Rubin and Dagan [105], who used a maximum likelihood technique to estimate a
constant random recharge as well as the mean log conductivity, the components of
the mean head gradient, and several statistical coefficients associated with the
prior covariance. The procedure was demonstrated with data from the Avra Valley
aquifer, and it showed that accounting for both transmissivity and head
measurements can significantly reduce the uncertainty in transmissivity
prediction.

Although linear inverse algorithms have the advantages of being simple and
reasonably robust, they require the introduction of certain approximations in the
maximum a posteriori problem formulation. This is because Bayesian parameter
estimates generally depend nonlinearly on measurements when the forward equations
are nonlinear. Nonlinear inverse algorithms do not involve as many assumptions;
however, they are less likely to provide unique solutions and can be difficult to
apply in practice [85]. Nonlinear methods have become increasingly attractive
because they are capable of carrying out "history matching" exercises, and hence
can be applied effectively to transient groundwater problems involving
contaminant plume predictions. Representative nonlinear groundwater inverse
algorithms can be found in works by Marsily [80], Marsily et al. [81], McLaughlin
and Wood [86, 87], Certes and Marsily [20], Graham and McLaughlin [62], Reid and
McLaughlin [100], Graham and Tankersley [64], Graham and Neff [65], and Snodgrass
and Kitanidis [115]. Most of these algorithms are based on a common philosophy of
progressively refining the "prior" estimates using the measurements until neither
the parameter estimates nor the predictions can be improved further.

A  pilot  point  method  was  introduced  by  Marsily  [80].  It  is  based  on  the  idea 
that  the  effective  log  conductivity  can  be  approximated  by  a  smooth  function  which 
reproduces  available  log  conductivity  measurements  while  giving  an  acceptable  fit  to 
head  data.  A  nonlinear  least  squares  approach  was  incorporated  in  the  algorithm  for 
estimating  the  unknown  pilot  values.  Since  the  least  squares  performance  index  being 
minimized  involved  only  the  head  measurements,  and  no  constraint  was  imposed  on 
the  conductivity  estimates,  its  stability  may  be  questionable  [85].  Other  applications 
using  the  pilot  point  methods  with  respect  to  transient  head  data  records  in  a  few 
field  tests  have  been  reported  by  Marsily  et  al.  [81]  and  Certes  and  Marsily  [20]. 

Graham and McLaughlin [62] presented a stochastic model for predicting the
migration of a nonreactive solute in a two-dimensional saturated aquifer under
site-specific conditions. By conditioning the ensemble moments on field
observations of log conductivity, head, and concentration, improved concentration
predictions and site-specific estimates of velocity components were achieved. The
conditioning procedure was carried out by a distributed parameter extended Kalman
filter, a nonlinear recursive estimator that updates estimates with only the most
recent measurements. The state equations, derived from the conventional solute
transport equation using a stochastic perturbation method, propagate the state
variables and covariances between measurement times. This model was tested
successfully on two synthetic cases [62] and later applied to the Borden field
tracer test [63]. Graham and Tankersley [64] and Graham and Neff [65] applied a
similar optimal estimation procedure to the Upper Floridan Aquifer in northeast
Florida to estimate both transmissivity and recharge fields.

A Lagrangian approach to the conditioning of two-dimensional transport on
hydraulic and concentration data was described in a series of papers by Rubin
[102, 103, 104] and Rubin and Dagan [106]. Their approach combined low-order
semianalytic conditioning of velocity fields on measurements by means of
cokriging, conditional Monte Carlo simulation of the corresponding Gaussian
velocity fields, and the tracking of one or two particles through each simulated
field. Conditional ensemble concentration moments were obtained from the
conditional one- and/or two-particle spatial moments under the assumption that
the corresponding mean concentration is spatially Gaussian. Rubin [102] evaluated
the conditional particle spatial moments and concentrations induced by an
instantaneous source of small area based on locally measured log transmissivity
and hydraulic head data. Rubin and Dagan [106] later used log transmissivity data
to condition the single-particle travel times (subject to an instantaneous point
source). They showed that prediction uncertainty could be reduced by increasing
the density of measurements in a control plane sufficiently far downstream from
the source.

Neuman [91] developed a unified Eulerian-Lagrangian theory of transport
conditioned on hydraulic data in spatially and temporally nonstationary velocity
fields. An exact, closed-form transport equation in terms of the first two
conditional Lagrangian velocity moments and the conditional Lagrangian
cross-covariance between velocity and its forcing terms was presented. Given
these three Lagrangian moments, the equation can be solved exactly for the
conditional ensemble mean solute concentration and flux. The advantages of
Neuman's Eulerian-Lagrangian theory are that it avoids approximations which are
generally employed in most Eulerian theories, and that it does not require any
prior distributional assumption about particle displacements to predict solute
concentration and flux. However, the applicability of this approach in field
practice is questionable because Lagrangian velocities in the subsurface are
difficult, if not impossible, to measure.

Knowledge of the extent and distribution of NAPLs present as a residual phase in
the subsurface environment is crucial to the success of field remediation
practices, but is generally difficult and expensive to obtain through traditional
field techniques such as soil coring and cone penetrometer testing. Recently,
alternative methods such as the interwell partitioning tracer test (IWPT), which
involves the simultaneous injection of several tracers with different NAPL-water
partitioning coefficients into NAPL-contaminated porous formations, have been
discussed by Jin et al. [72] and Annable et al. [2, 5]. Since the separation
between the tracer pulses depends on both the residual NAPL saturation and the
partitioning coefficients, it is possible to calculate the volumetric residual
NAPL content averaged over the tracer-swept volume by analyzing the tracer
breakthrough curves (BTCs) at extraction wells if the partitioning coefficient of
the partitioning tracer is known. This method, however, does not generate a
distribution map of NAPL residual content. In a field tracer experiment recently
conducted by Annable et al. [3], a three-dimensional sampling network was utilized
to measure tracer concentrations at 5 distinct vertical and 12 distinct
horizontal locations. The purpose of this research is to provide an effective
means to interpret such three-dimensional tracer breakthrough data so that the
three-dimensional residual NAPL distribution can be identified.
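
The dependence of tracer separation on residual saturation can be made concrete
with a small calculation. The sketch below assumes the retardation relation
commonly used to interpret partitioning tracer tests under linear equilibrium
partitioning, R = 1 + K_N S_N / (1 - S_N), where R is the ratio of the mean travel
times of the partitioning and nonpartitioning tracers and S_N is the NAPL
saturation averaged over the swept volume. The function names and the numbers are
illustrative assumptions, not data from the cited experiments.

```python
def retardation_factor(mean_time_partitioning, mean_time_conservative):
    """Ratio of the mean travel times of the partitioning and nonpartitioning tracers."""
    if mean_time_conservative <= 0:
        raise ValueError("mean travel time of the conservative tracer must be positive")
    return mean_time_partitioning / mean_time_conservative


def napl_saturation(retardation, k_n):
    """Invert R = 1 + K_N * S_N / (1 - S_N) for the swept-volume average saturation S_N."""
    if retardation < 1.0:
        raise ValueError("a partitioning tracer cannot arrive before the conservative one")
    return (retardation - 1.0) / (retardation - 1.0 + k_n)


# Hypothetical breakthrough-curve mean travel times (days) and partitioning coefficient.
R = retardation_factor(mean_time_partitioning=1.5, mean_time_conservative=1.0)
S_N = napl_saturation(R, k_n=10.0)
```

The result is a single averaged value per extraction well, which is why this
calculation alone cannot produce a spatial map of NAPL content.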

Estimation of the NAPL residual saturation field using inverse methods has not
yet received much attention. A recent study by James et al. [70] introduced a
stochastic method to estimate the distribution of NAPL residual content from
integrated tracer breakthrough curve moments at sampling points distributed in
three dimensions. Using first-order stochastic perturbation methods, James et al.
[70] derived partial differential equations for the covariances and
cross-covariances between the partitioning tracer temporal moments,
nonpartitioning tracer temporal moments, residual NAPL saturation, pore water
velocity, and hydraulic conductivity fields in a steady-state flow field. These
differential equations were then solved numerically and incorporated into a
conditioning algorithm to produce the optimal estimates. The model was tested
with a synthetically generated data set patterned after the tracer tests
conducted at Hill AFB by Annable et al. [3]. The algorithm was shown to
successfully predict the major features of the synthetically generated NAPL
distribution.

The main aim of the present study is to present a systematic model for estimating
aquifer hydrogeochemical parameters from measured partitioning and nonpartitioning
tracer concentration data. Unconditional moment analyses for a partitioning solute
in a NAPL-contaminated heterogeneous aquifer were presented in the last chapter,
and the statistical correlations between random fields were discussed in detail.
In this chapter these correlations are utilized with site-specific concentration
measurements to optimally estimate concentration and other aquifer parameters of
interest using a distributed parameter extended Kalman filter.

The methodology used in this work is similar to that utilized by Graham and
McLaughlin [62] and James et al. [70]. The state variables considered here consist
of solute concentration, the three components of Darcy flux, and NAPL content.
Based on a stochastic perturbation method, the propagation equations for the
moments of the state variables can be derived (see the last chapter for details).
Concentration measurements at discrete times are used in a distributed parameter
extended Kalman filter to update the ensemble moments. One of the major
differences between the extended Kalman filter developed here and the
steady-state algorithm of James et al. is that the present algorithm does not
rely on full concentration breakthrough curves to obtain the conditional moments.
The extended Kalman filter algorithm is able to generate optimal estimates of the
state variables using only a few discrete-time measurements along each
breakthrough curve, which is advantageous in practice. Synthetic cases will be
studied to demonstrate the applicability of the algorithm.

3.2    Theory,  Kalman  Filtering  Equations 

The problem of determining the state of a system from noisy measurements is
called estimation, or filtering [71]. Stochastic filtering theory has been applied
in various disciplines since the 1960s, following the fundamental work of Kalman
and Bucy [73] in linear filtering theory and the work of Stratonovich [120] in
nonlinear filtering theory (see the detailed discussions by Jazwinski [71],
Schweppe [111], and Maybeck [82]). The ordinary Kalman filter is a linear, optimal,
recursive data processing algorithm. A Kalman filter is optimal with respect to
virtually any reasonable criterion, and it incorporates all available measurements
to estimate the current values of the state variables of interest. By combining
all the information concerning (1) system dynamics and system uncertainty, (2)
measurements and measurement noise, and (3) initial conditions of the variables of
interest, a Kalman filter generates an overall best estimate of the state in the
sense that the estimation error is minimized statistically. The recursive nature
of a Kalman filter provides an important advantage in implementation because each
new measurement needs to be processed only once. Consequently, previously
processed data do not need to be kept in storage. If a Bayesian viewpoint is
adopted, it is the conditional probability density of the desired variables,
conditioned on knowledge of the data from the measuring devices, that must be
propagated by the filter. A Kalman filter, which propagates the first- and
second-order statistics, includes all information contained in the conditional
probability density under three conditions: (1) the uncertain variables are
normally distributed, (2) the system can be described by a linear model, and (3)
the measurement error is white Gaussian noise. In this case, the Kalman filter
can be shown to be the best filter of any conceivable form, and it produces a
unique "best" estimate of the variables of interest [82]. If the normality
assumption on the conditional probability density is removed, the Kalman filter
can be shown to be the best linear estimator in the minimum-variance sense. When
the system is not linear, the Kalman filter must generally be extended to enlarge
its range of applicability [111].
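
The practical meaning of the recursive property can be checked on a deliberately
simple case. The sketch below assumes a single constant (static) scalar state
observed repeatedly with independent Gaussian errors; it is an illustration of the
linear Kalman update only, not of the filter developed in this chapter. The
function names and numerical values are hypothetical.

```python
def kalman_update(mean, var, z, meas_var):
    """One scalar Kalman measurement update for a directly observed state."""
    gain = var / (var + meas_var)
    return mean + gain * (z - mean), (1.0 - gain) * var


def batch_estimate(prior_mean, prior_var, measurements, meas_var):
    """Posterior mean and variance obtained by using all measurements at once."""
    precision = 1.0 / prior_var + len(measurements) / meas_var
    mean = (prior_mean / prior_var + sum(measurements) / meas_var) / precision
    return mean, 1.0 / precision


# Hypothetical measurements of a constant state; prior mean 0.0 and variance 4.0.
measurements = [1.2, 0.9, 1.1, 1.0]
mean, var = 0.0, 4.0
for z in measurements:
    # Each measurement is processed once and can then be discarded.
    mean, var = kalman_update(mean, var, z, meas_var=0.25)

batch_mean, batch_var = batch_estimate(0.0, 4.0, measurements, meas_var=0.25)
```

Under these linear Gaussian assumptions the recursive and batch results coincide,
so nothing is lost by discarding each measurement after it has been processed.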

An extended Kalman filter is a nonlinear algorithm that is commonly used for
nonlinear filtering problems in which best estimates of the spatial distribution
of state variables are sought based on measurements available only at discrete
times [71, 62]. Figure 3.1 depicts the process of estimate propagation from time
$t_{i-1}$ to $t_i$ for an extended Kalman filter. Consider two measurement times,
$t_{i-1}$ and $t_i$. The filter propagates optimal estimates from the point just
after the measurement at time $t_{i-1}$ has been incorporated into the estimate,
denoted by $t_{i-1}^{+}$, to the point just after the measurement at time $t_i$ has
been incorporated, denoted by $t_i^{+}$. From a Bayesian point of view, the
conditional probability density at time $t_{i-1}^{+}$ is based on the entire
measurement history up to that time. The conditional probability density
propagates forward to the next measurement time $t_i$ and is updated again to
generate the optimal estimates at time $t_i^{+}$. This process is repeated whenever
a new set of measurements becomes available. Thus, it is intuitively convenient to
subdivide the extended Kalman filter into two parts: (1) a propagation component,
which specifies how the first and second statistical moments of the state
variables evolve from measurement time $t_{i-1}^{+}$ to measurement time $t_i^{-}$,
and (2) an update component, which conditions the moments on the newly available
measurements at time $t_i$.
Figure 3.1: Estimate propagation. Concentration measurements $C^*_{i-1}$ and $C^*_i$
become available at time $t_{i-1}$ and time $t_i$, respectively.
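
The propagation and update cycle can be sketched on a toy problem. The example
below is an assumption-laden stand-in for the transport problem: a concentration
c decays as dc/dt = -k c, the unknown time-invariant rate k plays the role of an
aquifer parameter, and only c is measured. The augmented state [c, k], the
function names, and every number are hypothetical; the moment equations of this
dissertation are not reproduced here.

```python
import math


def propagate(state, cov, dt):
    """Propagate the mean and covariance of [c, k] between measurement times."""
    c, k = state
    decay = math.exp(-k * dt)
    # Jacobian of the transition (c, k) -> (c * exp(-k dt), k) at the current estimate.
    f = [[decay, -c * dt * decay], [0.0, 1.0]]
    fp = [[sum(f[i][m] * cov[m][j] for m in range(2)) for j in range(2)] for i in range(2)]
    new_cov = [[sum(fp[i][m] * f[j][m] for m in range(2)) for j in range(2)] for i in range(2)]
    return [c * decay, k], new_cov


def update(state, cov, z, meas_var):
    """Condition the moments on a measurement of c only (measurement matrix [1, 0])."""
    innovation = z - state[0]
    s = cov[0][0] + meas_var
    gain = [cov[0][0] / s, cov[1][0] / s]
    new_state = [state[i] + gain[i] * innovation for i in range(2)]
    new_cov = [[cov[i][j] - gain[i] * cov[0][j] for j in range(2)] for i in range(2)]
    return new_state, new_cov


true_k = 0.5                       # value used to synthesize the observations
state = [1.0, 0.3]                 # known initial concentration, prior guess of k
cov = [[0.0, 0.0], [0.0, 0.04]]    # no initial concentration uncertainty
prior_k_var = cov[1][1]

for i in range(1, 4):
    state, cov = propagate(state, cov, dt=1.0)       # from t_{i-1}^+ to t_i^-
    z = math.exp(-true_k * i)                        # synthetic observation at t_i
    state, cov = update(state, cov, z, meas_var=1e-4)  # from t_i^- to t_i^+

k_estimate, k_var = state[1], cov[1][1]
```

Although k is never observed directly, its estimate moves toward the value used
to generate the data and its variance shrinks, because the propagation step builds
a cross-covariance between c and k that the update step exploits.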

The optimal estimation procedure developed here is intended to obtain the "best"
estimates of hydrogeochemical parameters from discrete-time concentration
measurements of partitioning tracers in an aquifer saturated with water and a
small amount of NAPL in residual phase. The basic assumptions include the
following: (1) the unconditional hydraulic conductivity field is stationary and
statistically isotropic, and causes the spatial variation in the Darcy flux field,
which is stationary but anisotropic under the condition of a unidirectional,
steady-state hydraulic gradient [57]; (2) the unconditional NAPL residual
saturation field is time-invariant; and (3) the resulting tracer concentration
field is transient and nonstationary. Thus the state variables in this problem
include concentration, the components of Darcy flux, and volumetric NAPL content.
The propagation component of this filter is represented by the set of coupled
stochastic partial differential equations derived in the previous chapter from
the advection-dispersion-retardation equation using stochastic perturbation
methods. These equations are solved numerically for moment propagation between
measurement times. The update part of this filter performs a function similar to
cokriging with a known prior mean. The development of the extended Kalman filter
is presented in the subsequent sections.

3.2.1    Problem  formulation 

Under steady-state flow conditions, the migration of a reactive solute through a
heterogeneous saturated aquifer with a random distribution of residual NAPL
saturation is governed by the advection-dispersion-retardation equation:

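$$(\theta_w + K_N \theta_n)\frac{\partial c}{\partial t} + q_i \frac{\partial c}{\partial x_i} - \theta_w D_{ij} \frac{\partial^2 c}{\partial x_i \, \partial x_j} = 0, \qquad \mathbf{x} \in D \tag{3.1}$$

$$c = c_0, \qquad \mathbf{x} \in D, \quad t = t_0 \tag{3.2}$$

$$q_1 c - \theta_w D_{11} \frac{\partial c}{\partial x_1} =
\begin{cases}
q_1 c_b & \text{for } 0 < t \le t_b, \quad \mathbf{x} \in \partial D_{in} \\
0 & \text{for } t > t_b, \quad \mathbf{x} \in \partial D_{in} \\
q_1 c & \text{for all } t, \quad \mathbf{x} \in \partial D_{out}
\end{cases} \tag{3.3}$$

$$\frac{\partial c}{\partial x_2} = 0 \qquad \text{for all } t, \quad \mathbf{x} \in \partial D_y \tag{3.4}$$

$$\frac{\partial c}{\partial x_3} = 0 \qquad \text{for all } t, \quad \mathbf{x} \in \partial D_z \tag{3.5}$$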

where the volumetric water content $\theta_w$, partitioning coefficient $K_N$, and
local dispersion coefficients $D_{ij}$, $i,j = 1,2,3$, are assumed to be spatially
invariant, deterministic constants. The location vector $\mathbf{x} = (x_1, x_2, x_3)$
is defined over domain $D$ with boundary $\partial D$. In this particular problem, a
constant hydraulic gradient is maintained in the longitudinal direction, and
$\partial D_{in}$ and $\partial D_{out}$ represent the upstream and downstream
boundaries perpendicular to this direction, respectively. A partitioning tracer is
injected through $\partial D_{in}$ for a length of time $t_b$. At the boundaries
parallel to the mean flow direction, i.e., $\partial D_y$ and $\partial D_z$,
zero-mass-flux boundary conditions are imposed by equations (3.4) and (3.5),
respectively. In addition, the initial concentration $c_0$, injection concentration
$c_b$, and time of injection $t_b$ are regarded as deterministic constants. It is
further assumed that the local dispersion coefficient tensor $D_{ij}$
($[L^2 T^{-1}]$) is directly related to the pore water velocity $\mathbf{v}$ as
suggested by Bear [7], i.e.,

$$D_{ij} = \alpha_T v \, \delta_{ij} + (\alpha_L - \alpha_T) \frac{v_i v_j}{v} \tag{3.6}$$

where $\alpha_L$ and $\alpha_T$ represent local-scale, deterministic, longitudinal and
transverse dispersivities, and $v$ is the magnitude of the pore water velocity.
Consideration of molecular diffusion has been omitted because our domain is within
the definition of field scale. In equations (3.3) through (3.5), $D_{11}$ represents
the longitudinal local dispersion coefficient.
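
As a numerical check of equation (3.6), the sketch below evaluates Bear's
relation, $D_{ij} = \alpha_T v \delta_{ij} + (\alpha_L - \alpha_T) v_i v_j / v$, for a
given pore water velocity vector. The function name and the dispersivity and
velocity values are hypothetical and chosen only for illustration.

```python
import math


def dispersion_tensor(velocity, alpha_l, alpha_t):
    """Return the 3x3 local dispersion tensor D_ij of equation (3.6) as nested lists."""
    speed = math.sqrt(sum(component ** 2 for component in velocity))
    if speed == 0.0:
        raise ValueError("equation (3.6) is undefined for zero velocity")
    return [
        [
            alpha_t * speed * (1.0 if i == j else 0.0)
            + (alpha_l - alpha_t) * velocity[i] * velocity[j] / speed
            for j in range(3)
        ]
        for i in range(3)
    ]


# Flow aligned with the longitudinal axis: the tensor is diagonal,
# with D_11 = alpha_L * v and D_22 = D_33 = alpha_T * v.
D = dispersion_tensor(velocity=(2.0, 0.0, 0.0), alpha_l=0.1, alpha_t=0.01)

# Oblique flow produces symmetric off-diagonal terms.
D_oblique = dispersion_tensor(velocity=(1.0, 2.0, 2.0), alpha_l=0.1, alpha_t=0.01)
```

For flow along $x_1$ the off-diagonal terms vanish, which is consistent with only
$D_{11}$ appearing in the inlet and outlet boundary condition.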

Quantities that are assumed to be statistically random include the components
of Darcy flux, $q_i$, $i = 1,2,3$, and the volumetric NAPL content $\theta_n$, which
are temporally invariant, and the concentration $c$, which varies with time. For
simplicity, these random quantities can be expressed by a composite state vector
$\mathbf{Z}$, in matrix form,

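$$\mathbf{Z} = [\mathbf{C}^T(\mathbf{x},t),\ \mathbf{q}_1^T(\mathbf{x}),\ \mathbf{q}_2^T(\mathbf{x}),\ \mathbf{q}_3^T(\mathbf{x}),\ \mathbf{Y}^T(\mathbf{x})]^T \tag{3.7}$$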

where superscript T represents the matrix transpose. Note that location vectors
are denoted by boldface letters, and it is understood that in three dimensions
$\mathbf{x} = (x_1, x_2, x_3)$, with subscript "1" representing the index in the
longitudinal direction and "2" and "3" representing indices in the transverse
directions. Since the ultimate goal is to obtain a physically meaningful estimate
of NAPL content, the natural logarithm of the volumetric NAPL content
($Y = \ln\theta_n$) is used to avoid possible negative estimates of $\theta_n$. In a
discrete system, $\mathbf{C}(\mathbf{x},t)$, $\mathbf{q}_i(\mathbf{x})$, and
$\mathbf{Y}(\mathbf{x})$ each represent an n-vector, where n is the number of blocks
into which the system domain is divided. Overall, $\mathbf{Z}$ has a dimension of
$5n \times 1$.
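
The layout of the composite state vector and the effect of the logarithmic
transform can be sketched as follows. The block ordering follows equation (3.7);
the helper names and the three-block field values are hypothetical and serve only
as an illustration.

```python
import math


def build_state_vector(conc, q1, q2, q3, theta_n):
    """Stack C, q1, q2, q3 and Y = ln(theta_n) into one list of length 5n."""
    n = len(conc)
    if not all(len(field) == n for field in (q1, q2, q3, theta_n)):
        raise ValueError("all fields must be defined on the same n blocks")
    log_napl = [math.log(value) for value in theta_n]
    return list(conc) + list(q1) + list(q2) + list(q3) + log_napl


def napl_content(state, n):
    """Back-transform the last n entries (Y) to theta_n = exp(Y), which is positive."""
    return [math.exp(y) for y in state[4 * n:5 * n]]


n = 3  # number of grid blocks in this toy domain
Z = build_state_vector(
    conc=[0.0, 0.2, 0.1],
    q1=[0.5, 0.6, 0.4],
    q2=[0.0, 0.01, -0.01],
    q3=[0.0, 0.0, 0.0],
    theta_n=[0.02, 0.05, 0.01],
)

# An estimator may shift Y by a large negative amount; the back-transformed
# NAPL content still remains positive.
Z[4 * n] -= 10.0
theta_n_estimate = napl_content(Z, n)
```

Estimating $\theta_n$ directly would allow an update step to produce a negative
value, whereas any real-valued estimate of Y maps back to a positive content.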

The composite state vector $\mathbf{Z}$ consists of either time-evolving variables
to be predicted, such as the concentration $c(\mathbf{x},t)$, which is the dependent
variable, or unknown system parameters to be inversely estimated, such as the
fluxes $q_i$ and $\ln\theta_n$, which are the independent variables. Equation (3.2)
implies that there is no initial concentration uncertainty if no solute is present
in the domain. Since all the boundary conditions are known perfectly, the evolving
concentration uncertainty (or variance) is attributed only to the variations in
the hydrogeochemical properties of the system. The time-varying behavior of the
mean concentration plume, as well as the associated macrodispersion and
uncertainty, is governed directly by the local-scale physical and chemical
processes described in the solute transport equation (3.1), including the
influence of heterogeneities in NAPL content and hydraulic conductivity.

Measurements are available at discrete times $t_1, t_2, \ldots, t_i, \ldots$ (often, but not necessarily, equally spaced in time), and are modeled by the relation, for all $t_i \in T$:

\[ \mathbf{Z}^*(t_i) = \mathbf{H}(t_i)\,\mathbf{Z}(t_i) + \mathbf{v}(t_i) \tag{3.8} \]

where $\mathbf{Z}^*$ is an $m$-vector discrete-time measurement process, one sample of which provides a particular measurement time history: $\mathbf{Z}^*(t_i, \omega) = \mathbf{Z}^*_i$ would be the measurement values that are available at time $t_i$. In equation (3.8), $\mathbf{H}(\cdot)$ is an $m \times 5n$ matrix (the measurement matrix) that defines the relationship between a measurement and its corresponding state variable, and $\mathbf{v}(\cdot)$ is an $m$-vector discrete-time white Gaussian noise with statistics, for all $t_i, t_j \in T$:

\[ E[\mathbf{v}(t_i)] = \mathbf{0} \tag{3.9} \]

\[ E[\mathbf{v}(t_i)\,\mathbf{v}^T(t_j)] = \begin{cases} \mathbf{R}(t_i), & t_i = t_j \\ \mathbf{0}, & t_i \neq t_j \end{cases} \tag{3.10} \]

where $\mathbf{R}(t_i)$ is an $m \times m$ symmetric, positive-definite matrix for all $t_i \in T$. In general, $\mathbf{R}$ defines the measurement error variance associated with a particular measurement. In this study it is further assumed that $\mathbf{Z}(\cdot)$ and $\mathbf{v}(\cdot)$ are independent.

In groundwater monitoring practice, concentration observations are usually relatively easy to obtain, whereas Darcy flux is not directly observable. Here it is assumed that only the solute concentration is measured, at $m_i$ discrete locations in domain $D$ at a discrete time $t_i$; therefore we have

\[ C^*(\mathbf{x}^*,t_i) = \mathbf{H}_c(t_i)\,C(\mathbf{x},t_i) + \mathbf{v}_c(\mathbf{x}^*,t_i) \tag{3.11} \]

where $\mathbf{x}^*$ denotes the spatial locations at which measurements are taken; $C^*(\mathbf{x}^*,t_i)$ is an $m_i$-vector composed of the discrete concentration measurements taken at time $t_i$; and $\mathbf{v}_c(\mathbf{x}^*,t_i)$ (a zero-mean $m_i$-vector) accounts for the measurement error introduced in the sampling process, whose covariances satisfy the conditions specified in equation (3.10).

The concentration measurement matrix $\mathbf{H}_c(t_i)$ has dimension $m_i \times n$ at time $t_i$, and it specifies how each discrete concentration measurement $c^*(\mathbf{x}^*,t_i)$ is related to the true concentration field $c(\mathbf{x},t)$. Since a measured concentration often represents a spatial average over a confined volume, $\mathbf{H}_c$ defines the necessary volume-averaging operation. If the measured concentration is assumed to be a point value, $\mathbf{H}_c$ represents a delta function operation.

Before ending this section, two points are worth addressing. First, although the state vector $\mathbf{Z}$ includes only concentration, Darcy flux, and residual NAPL content, the above analysis can easily be extended to cover variables such as hydraulic conductivity, head, and recharge, which are often of great interest in problems of flow and solute transport. Second, if measurements of the flux $q_i$, $i = 1, 2, 3$, and of the volumetric NAPL content become available in addition to concentration measurements, the complete measurement matrix $\mathbf{H}(t_i)$ reads as follows:

\[ \mathbf{H}(t_i) = \begin{bmatrix} \mathbf{H}_c & 0 & 0 & 0 & 0 \\ 0 & \mathbf{H}_{q_1} & 0 & 0 & 0 \\ 0 & 0 & \mathbf{H}_{q_2} & 0 & 0 \\ 0 & 0 & 0 & \mathbf{H}_{q_3} & 0 \\ 0 & 0 & 0 & 0 & \mathbf{H}_y \end{bmatrix} \tag{3.12} \]

where, at time $t_i \in T$, the matrices $\mathbf{H}_{q_1}$, $\mathbf{H}_{q_2}$, $\mathbf{H}_{q_3}$, and $\mathbf{H}_y$ have dimensions $m_{q_1 i} \times n$, $m_{q_2 i} \times n$, $m_{q_3 i} \times n$, and $m_{y i} \times n$, with $m_{q_1 i}$, $m_{q_2 i}$, $m_{q_3 i}$, and $m_{y i}$ representing the number of measurements available for $q_1$, $q_2$, $q_3$, and $Y$, respectively.
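The point-measurement case described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names, the six-block domain, the sampler positions, and the noise variance are all hypothetical, and each measurement is taken to be the value of the block that contains the sampler (the discrete analogue of a delta function operation).

```python
import numpy as np

def point_measurement_matrix(sample_blocks, n):
    """H_c (m x n): each row selects the block that contains one sampler."""
    H = np.zeros((len(sample_blocks), n))
    H[np.arange(len(sample_blocks)), sample_blocks] = 1.0
    return H

def full_measurement_matrix(Hc, n):
    """H (m x 5n) when only concentration is measured: H = [H_c 0 0 0 0]."""
    return np.hstack([Hc, np.zeros((Hc.shape[0], 4 * n))])

rng = np.random.default_rng(0)
n = 6                      # hypothetical number of blocks
sample_blocks = [1, 4]     # hypothetical sampler locations (block indices)
Hc = point_measurement_matrix(sample_blocks, n)
H = full_measurement_matrix(Hc, n)

Z = rng.normal(size=5 * n)                       # stand-in for the true state
R = 1e-4 * np.eye(len(sample_blocks))            # assumed measurement error covariance
v = rng.multivariate_normal(np.zeros(len(sample_blocks)), R)
z_star = H @ Z + v                               # measurement model, equation (3.8)
```

A volume-averaging sampler would replace each one-hot row with weights that sum to one over the blocks it spans.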

3.2.2    Moment  propagation  equations 

The propagation of conditional moments between two adjacent measurement times is depicted in Figure 3.1. Assuming that concentration measurements, represented in matrix notation as $C^*_{i-1}$ and $C^*_i$, are available at times $t_{i-1}$ and $t_i$, respectively, the conditional moments at time $t_{i-1}$ represent the moments conditioned on all measurements taken through time $t_{i-1}$. These moments are then propagated forward in time until new measurements become available at time $t_i$.

The derivation of the unconditional moment equations from equations (3.1) to (3.5) using a first-order stochastic perturbation method was addressed in the last chapter. These same equations are applicable to conditional moment propagation between measurement times in the state estimation problem. However, the linearization in this case is performed about the most recent conditional means of the state variables rather than the unconditional means referenced previously [62]. Therefore, all of the stochastic partial differential equations for conditional moment propagation presented here are piecewise continuous within the semi-open time interval $[t_{n-1}, t_n)$, which is left-bounded by the measurement time $t_{n-1}$, indicating that the current state has been conditioned on all measurements taken through time $t_{n-1}$.

Equations for the conditional means of the state variables, i.e., $c$, $q_i$, $i = 1, 2, 3$, and $\ln\theta_n$, are listed as follows:


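\[
\left(\theta_w + K_N e^{\hat{y}(\mathbf{x}|t_{n-1})}\right)\frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial t}
+ K_N e^{\hat{y}(\mathbf{x}|t_{n-1})}\frac{\partial P_{yc}(\mathbf{x},\mathbf{x},t|t_{n-1})}{\partial t}
= -\frac{\partial}{\partial x_i}\left[\hat{q}_i(\mathbf{x}|t_{n-1})\,\hat{c}(\mathbf{x},t|t_{n-1})\right]
- \frac{\partial}{\partial x_i}\left[P_{q_i c}(\mathbf{x},\mathbf{x},t|t_{n-1})\right]
+ \frac{\partial}{\partial x_i}\left[\theta_w D_{ij}\frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial x_j}\right],
\quad \mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.13}
\]

\[ \hat{c}(\mathbf{x},t|t_{n-1}) = \hat{c}(\mathbf{x},t_{n-1}|t_{n-1}), \quad \mathbf{x} \in D,\; t = t_{n-1} \tag{3.14} \]

\[
\hat{q}_1(\mathbf{x}|t_{n-1})\,\hat{c}(\mathbf{x},t|t_{n-1}) + P_{q_1 c}(\mathbf{x},\mathbf{x},t|t_{n-1}) - \theta_w D_{11}\frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial x_1}
= \begin{cases}
\hat{q}_1(\mathbf{x}|t_{n-1})\,c_b, & \mathbf{x} \in \partial D_{in},\; 0 < t_{n-1} < t < t_n < t_b \\
0, & \mathbf{x} \in \partial D_{in},\; t_b < t_{n-1} < t < t_n
\end{cases} \tag{3.15}
\]

\[ \frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial x_1} = 0, \quad \mathbf{x} \in \partial D_{out},\; t_{n-1} < t < t_n \tag{3.16} \]

\[ \frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial x_2} = 0, \quad \mathbf{x} \in \partial D_{y},\; t_{n-1} < t < t_n \tag{3.17} \]

\[ \frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial x_3} = 0, \quad \mathbf{x} \in \partial D_{z},\; t_{n-1} < t < t_n \tag{3.18} \]

\[ \frac{\partial \hat{q}_i(\mathbf{x}|t_{n-1})}{\partial t} = 0, \quad \text{for } i = 1, 2, 3, \quad \mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.19} \]

\[ \frac{\partial \hat{y}(\mathbf{x}|t_{n-1})}{\partial t} = 0, \quad \mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.20} \]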


The hats over the state variables denote "best estimates": $\hat{c}(\mathbf{x},t|t_{n-1})$, $\hat{q}_i(\mathbf{x}|t_{n-1})$, and $\hat{y}(\mathbf{x}|t_{n-1})$ represent the best available estimates of $c$, $q_i$, and $\ln\theta_n$, respectively, given all measurements collected through time $t_{n-1}$. Equations (3.19) and (3.20) imply that the parameters $q_i$, $i = 1, 2, 3$, and $y$ do not change between measurement times. The initial condition for equation (3.13) within the semi-open time interval $[t_{n-1}, t_n)$ is defined as the conditional concentration mean $\hat{c}$ based on all the measurements through time $t_{n-1}$.

The term $P_{q_i c}(\mathbf{x},\mathbf{x},t|t_{n-1})$ in equation (3.13) is the best estimate of the macrodispersive flux in the $i$-th direction, which is the conditional zero-lag covariance between the $i$-th component of Darcy flux and the concentration at location $\mathbf{x}$, given all measurements collected through time $t_{n-1}$.

The quantity $P_{q_i c}(\mathbf{x},\mathbf{x},t|t_{n-1})$ can be obtained by solving a stochastic partial differential equation for the cross-covariance $P_{q_i c}(\mathbf{x}',\mathbf{x},t)$ between the Darcy flux $q_i(\mathbf{x}')$ at location $\mathbf{x}'$ and the concentration $c(\mathbf{x},t)$ at location $\mathbf{x}$. The equation, derived via a first-order perturbation method in the last chapter, has the form:


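\[
\left(\theta_w + K_N e^{\hat{y}(\mathbf{x}|t_{n-1})}\right)\frac{\partial P_{q_i c}(\mathbf{x}',\mathbf{x},t|t_{n-1})}{\partial t}
= -K_N e^{\hat{y}(\mathbf{x}|t_{n-1})}\,P_{q_i y}(\mathbf{x}',\mathbf{x}|t_{n-1})\,\frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial t}
- \frac{\partial}{\partial x_j}\left[\hat{q}_j(\mathbf{x}|t_{n-1})\,P_{q_i c}(\mathbf{x}',\mathbf{x},t|t_{n-1})\right]
+ \frac{\partial}{\partial x_j}\left[\theta_w D_{jk}\frac{\partial}{\partial x_k}P_{q_i c}(\mathbf{x}',\mathbf{x},t|t_{n-1})\right]
- \frac{\partial}{\partial x_j}\left[\hat{c}(\mathbf{x},t|t_{n-1})\,P_{q_i q_j}(\mathbf{x}',\mathbf{x}|t_{n-1})\right],
\quad \text{for } i = 1, 2, 3, \quad \mathbf{x}',\mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.21}
\]

\[ P_{q_i c}(\mathbf{x}',\mathbf{x},t|t_{n-1}) = P_{q_i c}(\mathbf{x}',\mathbf{x},t_{n-1}|t_{n-1}), \quad \mathbf{x}',\mathbf{x} \in D,\; t = t_{n-1} \tag{3.22} \]

\[
\hat{q}_1(\mathbf{x}|t_{n-1})\,P_{q_i c}(\mathbf{x}',\mathbf{x},t|t_{n-1}) + P_{q_i q_1}(\mathbf{x}',\mathbf{x}|t_{n-1})\,\hat{c}(\mathbf{x},t|t_{n-1}) - \theta_w D_{11}\frac{\partial}{\partial x_1}P_{q_i c}(\mathbf{x}',\mathbf{x},t|t_{n-1})
= \begin{cases}
P_{q_i q_1}(\mathbf{x}',\mathbf{x}|t_{n-1})\,c_b, & \mathbf{x} \in \partial D_{in},\; \mathbf{x}' \in D,\; 0 < t_{n-1} \le t \le t_n < t_b \\
0, & \mathbf{x} \in \partial D_{in},\; \mathbf{x}' \in D,\; t_b < t_{n-1} \le t \le t_n
\end{cases}
\quad \text{for } i = 1, 2, 3 \tag{3.23}
\]

\[ \frac{\partial}{\partial x_1}P_{q_i c}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{out},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n, \quad \text{for } i = 1, 2, 3 \tag{3.24} \]

\[ \frac{\partial}{\partial x_2}P_{q_i c}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{y},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n, \quad \text{for } i = 1, 2, 3 \tag{3.25} \]

\[ \frac{\partial}{\partial x_3}P_{q_i c}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{z},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n, \quad \text{for } i = 1, 2, 3 \tag{3.26} \]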

Note that the covariances $P_{q_i c}(\cdot,\cdot,t|t_{n-1})$, $P_{q_i q_j}(\cdot,\cdot|t_{n-1})$, and $P_{q_i y}(\cdot,\cdot|t_{n-1})$ represent the best available estimates of $P_{q_i c}$, $P_{q_i q_j}$, and $P_{q_i y}$, respectively, given all measurements collected through time $t_{n-1}$. Notice that a closed time interval $[t_{n-1}, t_n]$ is designated for the boundary condition (equation (3.23)) because the injection concentration $c_b$ has been assumed to be known perfectly throughout the entire history of the transport process. To solve for $P_{q_i c}$ with equations (3.21), (3.22), and (3.23), knowledge of the covariances $P_{q_i q_j}$ and $P_{q_i y}$ is needed. These covariances are obtained from the following equations, for a semi-open time interval $[t_{n-1}, t_n)$:

\[ \frac{\partial P_{q_i q_j}(\mathbf{x}',\mathbf{x}|t_{n-1})}{\partial t} = 0, \quad \text{for } i, j = 1, 2, 3, \quad \mathbf{x}',\mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.27} \]

\[ \frac{\partial P_{q_i y}(\mathbf{x}',\mathbf{x}|t_{n-1})}{\partial t} = 0, \quad \text{for } i = 1, 2, 3, \quad \mathbf{x}',\mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.28} \]

Equations (3.27) and (3.28) imply that the conditional covariances $P_{q_i q_j}$ and $P_{q_i y}$ remain unchanged between measurement times.

The concentration mean equation also involves the temporal derivative of the conditional cross-covariance $P_{yc}(\mathbf{x},\mathbf{x},t|t_{n-1})$, which accounts for the correlation between $\ln\theta_n$ (or $y$), evaluated at location $\mathbf{x}$, and the concentration $c$, evaluated at the same location $\mathbf{x}$, conditioned on all the measurements collected through time $t_{n-1}$. This term can be obtained by solving the moment propagation equation for $P_{yc}(\mathbf{x}',\mathbf{x},t|t_{n-1})$ in the semi-open time interval $[t_{n-1}, t_n)$. This equation was derived in Chapter 2 for continuous time $t$, and is now rewritten in the following form:


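\[
\left(\theta_w + K_N e^{\hat{y}(\mathbf{x}|t_{n-1})}\right)\frac{\partial P_{yc}(\mathbf{x}',\mathbf{x},t|t_{n-1})}{\partial t}
= -\left(K_N e^{\hat{y}(\mathbf{x}|t_{n-1})}\right)P_{yy}(\mathbf{x}',\mathbf{x}|t_{n-1})\,\frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial t}
- \frac{\partial}{\partial x_i}\left[\hat{q}_i(\mathbf{x}|t_{n-1})\,P_{yc}(\mathbf{x}',\mathbf{x},t|t_{n-1})\right]
+ \frac{\partial}{\partial x_i}\left[\theta_w D_{ij}\frac{\partial}{\partial x_j}P_{yc}(\mathbf{x}',\mathbf{x},t|t_{n-1})\right]
- \frac{\partial}{\partial x_i}\left[\hat{c}(\mathbf{x},t|t_{n-1})\,P_{y q_i}(\mathbf{x}',\mathbf{x}|t_{n-1})\right],
\quad \text{for } i, j = 1, 2, 3, \quad \mathbf{x}',\mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.29}
\]

\[ P_{yc}(\mathbf{x}',\mathbf{x},t|t_{n-1}) = P_{yc}(\mathbf{x}',\mathbf{x},t_{n-1}|t_{n-1}), \quad \mathbf{x}',\mathbf{x} \in D,\; t = t_{n-1} \tag{3.30} \]

\[
\hat{q}_1(\mathbf{x}|t_{n-1})\,P_{yc}(\mathbf{x}',\mathbf{x},t|t_{n-1}) + P_{y q_1}(\mathbf{x}',\mathbf{x}|t_{n-1})\,\hat{c}(\mathbf{x},t|t_{n-1}) - \theta_w D_{11}\frac{\partial}{\partial x_1}P_{yc}(\mathbf{x}',\mathbf{x},t|t_{n-1})
= \begin{cases}
P_{y q_1}(\mathbf{x}',\mathbf{x}|t_{n-1})\,c_b, & \mathbf{x} \in \partial D_{in},\; \mathbf{x}' \in D,\; 0 < t_{n-1} < t < t_n < t_b \\
0, & \mathbf{x} \in \partial D_{in},\; \mathbf{x}' \in D,\; t_b < t_{n-1} < t < t_n
\end{cases} \tag{3.31}
\]

\[ \frac{\partial}{\partial x_1}P_{yc}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{out},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n \tag{3.32} \]

\[ \frac{\partial}{\partial x_2}P_{yc}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{y},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n \tag{3.33} \]

\[ \frac{\partial}{\partial x_3}P_{yc}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{z},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n \tag{3.34} \]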

As discussed before for $P_{q_i c}(\mathbf{x},\mathbf{x}',t|t_{n-1})$, $P_{yc}(\mathbf{x},\mathbf{x}',t|t_{n-1})$ represents the best available estimate of $P_{yc}(\mathbf{x},\mathbf{x}',t)$ based on all the measurements collected through time $t_{n-1}$.

The propagation equation for the conditional concentration covariance $P_{cc}(\mathbf{x}',\mathbf{x},t)$ at two different locations $\mathbf{x}'$ and $\mathbf{x}$ in the semi-open time interval $[t_{n-1}, t_n)$ is also derived using the first-order perturbation method discussed in Chapter 2. Again, $P_{cc}(\mathbf{x}',\mathbf{x},t|t_{n-1})$ is understood to be the best available estimate of $P_{cc}(\mathbf{x}',\mathbf{x},t)$ based on all measurements available through time $t_{n-1}$. The concentration covariance equation is written:


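\[
\frac{\partial P_{cc}(\mathbf{x}',\mathbf{x},t|t_{n-1})}{\partial t}
= \frac{1}{\theta_w + K_N e^{\hat{y}(\mathbf{x}|t_{n-1})}}
\Big\{
-\left(K_N e^{\hat{y}(\mathbf{x}|t_{n-1})}\right)P_{cy}(\mathbf{x}',\mathbf{x},t|t_{n-1})\frac{\partial \hat{c}(\mathbf{x},t|t_{n-1})}{\partial t}
- \frac{\partial}{\partial x_i}\left[\hat{q}_i(\mathbf{x}|t_{n-1})\,P_{cc}(\mathbf{x}',\mathbf{x},t|t_{n-1})\right]
+ \frac{\partial}{\partial x_i}\left[\theta_w D_{ij}\frac{\partial}{\partial x_j}P_{cc}(\mathbf{x}',\mathbf{x},t|t_{n-1})\right]
- \frac{\partial}{\partial x_i}\left[\hat{c}(\mathbf{x},t|t_{n-1})\,P_{c q_i}(\mathbf{x}',\mathbf{x},t|t_{n-1})\right]
\Big\}
\]
\[
\qquad + \frac{1}{\theta_w + K_N e^{\hat{y}(\mathbf{x}'|t_{n-1})}}
\Big\{
-\left(K_N e^{\hat{y}(\mathbf{x}'|t_{n-1})}\right)P_{cy}(\mathbf{x},\mathbf{x}',t|t_{n-1})\frac{\partial \hat{c}(\mathbf{x}',t|t_{n-1})}{\partial t}
- \frac{\partial}{\partial x'_i}\left[\hat{q}_i(\mathbf{x}'|t_{n-1})\,P_{cc}(\mathbf{x},\mathbf{x}',t|t_{n-1})\right]
+ \frac{\partial}{\partial x'_i}\left[\theta_w D_{ij}\frac{\partial}{\partial x'_j}P_{cc}(\mathbf{x},\mathbf{x}',t|t_{n-1})\right]
- \frac{\partial}{\partial x'_i}\left[\hat{c}(\mathbf{x}',t|t_{n-1})\,P_{c q_i}(\mathbf{x},\mathbf{x}',t|t_{n-1})\right]
\Big\},
\quad \text{for } i, j = 1, 2, 3, \quad \mathbf{x}',\mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.35}
\]

\[ P_{cc}(\mathbf{x}',\mathbf{x},t|t_{n-1}) = P_{cc}(\mathbf{x}',\mathbf{x},t_{n-1}|t_{n-1}), \quad \mathbf{x}',\mathbf{x} \in D,\; t = t_{n-1} \tag{3.36} \]

\[
\hat{q}_1(\mathbf{x}|t_{n-1})\,P_{cc}(\mathbf{x}',\mathbf{x},t|t_{n-1}) + P_{c q_1}(\mathbf{x}',\mathbf{x},t|t_{n-1})\,\hat{c}(\mathbf{x},t|t_{n-1}) - \theta_w D_{11}\frac{\partial}{\partial x_1}P_{cc}(\mathbf{x}',\mathbf{x},t|t_{n-1})
= \begin{cases}
P_{c q_1}(\mathbf{x}',\mathbf{x},t|t_{n-1})\,c_b, & \mathbf{x} \in \partial D_{in},\; \mathbf{x}' \in D,\; 0 < t_{n-1} < t < t_n < t_b \\
0, & \mathbf{x} \in \partial D_{in},\; \mathbf{x}' \in D,\; t_b < t_{n-1} < t < t_n
\end{cases} \tag{3.37}
\]

\[
\hat{q}_1(\mathbf{x}'|t_{n-1})\,P_{cc}(\mathbf{x},\mathbf{x}',t|t_{n-1}) + P_{c q_1}(\mathbf{x},\mathbf{x}',t|t_{n-1})\,\hat{c}(\mathbf{x}',t|t_{n-1}) - \theta_w D_{11}\frac{\partial}{\partial x'_1}P_{cc}(\mathbf{x},\mathbf{x}',t|t_{n-1})
= \begin{cases}
P_{c q_1}(\mathbf{x},\mathbf{x}',t|t_{n-1})\,c_b, & \mathbf{x}' \in \partial D_{in},\; \mathbf{x} \in D,\; 0 < t_{n-1} < t < t_n < t_b \\
0, & \mathbf{x}' \in \partial D_{in},\; \mathbf{x} \in D,\; t_b < t_{n-1} < t < t_n
\end{cases} \tag{3.38}
\]

\[ \frac{\partial}{\partial x_1}P_{cc}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{out},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n \tag{3.39} \]

\[ \frac{\partial}{\partial x'_1}P_{cc}(\mathbf{x},\mathbf{x}',t) = 0, \quad \mathbf{x}' \in \partial D_{out},\; \mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.40} \]

\[ \frac{\partial}{\partial x_2}P_{cc}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{y},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n \tag{3.41} \]

\[ \frac{\partial}{\partial x'_2}P_{cc}(\mathbf{x},\mathbf{x}',t) = 0, \quad \mathbf{x}' \in \partial D_{y},\; \mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.42} \]

\[ \frac{\partial}{\partial x_3}P_{cc}(\mathbf{x}',\mathbf{x},t) = 0, \quad \mathbf{x} \in \partial D_{z},\; \mathbf{x}' \in D,\; t_{n-1} < t < t_n \tag{3.43} \]

\[ \frac{\partial}{\partial x'_3}P_{cc}(\mathbf{x},\mathbf{x}',t) = 0, \quad \mathbf{x}' \in \partial D_{z},\; \mathbf{x} \in D,\; t_{n-1} < t < t_n \tag{3.44} \]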

Equations (3.13)-(3.44) constitute a system of coupled stochastic partial differential equations that describe the propagation of the conditional moments of the state variables between measurement times $t_{n-1}$ and $t_n$. These equations all share the same advection and dispersion structure and may be solved numerically in the same fashion as the unconditional moment equations illustrated in the last chapter. Since the conditional covariance equations are first-order approximations, their solution accuracy depends largely on the presumed smallness of the variability of the unconditional random fields under consideration. However, the dependence on the small-perturbation assumption for the conditional moment propagation equations can be reduced by increasing the number of measurements [62].

3.2.3    Moment  update  equations 

The moment propagation equations have been summarized in Section 3.2.2. Moments of the state variables based on the entire history of measurements to time $t_{n-1}$ are propagated forward to the next time $t_n$ at which new measurements become available. The process by which these moments are updated based on the new measurements at time $t_n$ is called conditioning. Note that the following discussion is based on the assumption that only concentration is measured.

The equations for the updated conditional means of the state variables at time $t_n$ are

\[ \hat{c}(\mathbf{x},t_n|t_n) = \hat{c}(\mathbf{x},t_n|t_{n-1}) + \mathbf{K}_{cc}(\mathbf{x},t_n)\,\tilde{c}(\mathbf{x}^*,t_n) \tag{3.45} \]

\[ \hat{q}_i(\mathbf{x},t_n|t_n) = \hat{q}_i(\mathbf{x},t_n|t_{n-1}) + \mathbf{K}_{q_i c}(\mathbf{x},t_n)\,\tilde{c}(\mathbf{x}^*,t_n) \tag{3.46} \]

\[ \hat{y}(\mathbf{x},t_n|t_n) = \hat{y}(\mathbf{x},t_n|t_{n-1}) + \mathbf{K}_{yc}(\mathbf{x},t_n)\,\tilde{c}(\mathbf{x}^*,t_n) \tag{3.47} \]

where

\[ \tilde{c}(\mathbf{x}^*,t_n) = c^*(\mathbf{x}^*,t_n) - \mathbf{H}_c(t_n)\,\hat{c}(\mathbf{x},t_n|t_{n-1}) \tag{3.48} \]

As indicated in equation (3.48), $\tilde{c}(\mathbf{x}^*,t_n)$ is the residual between the measured concentration $c^*(\mathbf{x}^*,t_n)$ and the propagated concentration $\hat{c}(\mathbf{x}^*,t_n|t_{n-1})$ at measurement location $\mathbf{x}^*$. The residual is a measure of the error in the filter's prediction and can be used to update the estimate not only at the measurement locations but everywhere in the domain, based on the concentration correlation structure.

The Kalman gains $\mathbf{K}_{cc}$, $\mathbf{K}_{q_i c}$, and $\mathbf{K}_{yc}$ in equations (3.45) to (3.47) specify how much weight (or confidence) each measurement residual carries in computing the updated states, parameters, and covariances. The Kalman gains are evaluated from the following matrix equation [62]:


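\[
\begin{bmatrix}
\mathbf{K}_{cc}(\mathbf{x},t_n) & \mathbf{K}_{c q_i}(\mathbf{x},t_n) & \mathbf{K}_{cy}(\mathbf{x},t_n) \\
\mathbf{K}_{q_j c}(\mathbf{x},t_n) & \mathbf{K}_{q_j q_i}(\mathbf{x},t_n) & \mathbf{K}_{q_j y}(\mathbf{x},t_n) \\
\mathbf{K}_{yc}(\mathbf{x},t_n) & \mathbf{K}_{y q_i}(\mathbf{x},t_n) & \mathbf{K}_{yy}(\mathbf{x},t_n)
\end{bmatrix}
= \mathbf{P}(t_n|t_{n-1})
\begin{bmatrix} \mathbf{H}_c^T(t_n) \\ 0 \\ 0 \end{bmatrix}
\left\{
\begin{bmatrix} \mathbf{H}_c(t_n) & 0 & 0 \end{bmatrix}
\mathbf{P}(t_n|t_{n-1})
\begin{bmatrix} \mathbf{H}_c^T(t_n) \\ 0 \\ 0 \end{bmatrix}
+
\begin{bmatrix}
\mathbf{R}_{cc}(t_n) & \mathbf{R}_{c q_i}(t_n) & \mathbf{R}_{cy}(t_n) \\
\mathbf{R}_{q_j c}(t_n) & \mathbf{R}_{q_j q_i}(t_n) & \mathbf{R}_{q_j y}(t_n) \\
\mathbf{R}_{yc}(t_n) & \mathbf{R}_{y q_i}(t_n) & \mathbf{R}_{yy}(t_n)
\end{bmatrix}
\right\}^{-1} \tag{3.49}
\]

in which the propagated covariance matrix is

\[
\mathbf{P}(t_n|t_{n-1}) =
\begin{bmatrix}
P_{cc}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) & P_{c q_i}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) & P_{cy}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) \\
P_{q_j c}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) & P_{q_j q_i}(\mathbf{x},\mathbf{x}'|t_{n-1}) & P_{q_j y}(\mathbf{x},\mathbf{x}'|t_{n-1}) \\
P_{yc}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) & P_{y q_i}(\mathbf{x},\mathbf{x}'|t_{n-1}) & P_{yy}(\mathbf{x},\mathbf{x}'|t_{n-1})
\end{bmatrix}
\]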

where $i, j = 1, 2, 3$, and the matrix $\mathbf{R}(t_n)$ is the measurement error covariance defined by equation (3.10). The updated covariance matrix is obtained from the following matrix equation [62]:

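\[
\mathbf{P}(t_n|t_n) = \mathbf{P}(t_n|t_{n-1}) - \mathbf{K}(t_n)
\begin{bmatrix} \mathbf{H}_c(t_n) & 0 & 0 \end{bmatrix}
\mathbf{P}(t_n|t_{n-1}) \tag{3.50}
\]

in which

\[
\mathbf{P}(t_n|t_n) =
\begin{bmatrix}
P_{cc}(\mathbf{x},\mathbf{x}',t_n|t_n) & P_{c q_i}(\mathbf{x},\mathbf{x}',t_n|t_n) & P_{cy}(\mathbf{x},\mathbf{x}',t_n|t_n) \\
P_{q_j c}(\mathbf{x},\mathbf{x}',t_n|t_n) & P_{q_j q_i}(\mathbf{x},\mathbf{x}'|t_n) & P_{q_j y}(\mathbf{x},\mathbf{x}'|t_n) \\
P_{yc}(\mathbf{x},\mathbf{x}',t_n|t_n) & P_{y q_i}(\mathbf{x},\mathbf{x}'|t_n) & P_{yy}(\mathbf{x},\mathbf{x}'|t_n)
\end{bmatrix},
\quad
\mathbf{P}(t_n|t_{n-1}) =
\begin{bmatrix}
P_{cc}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) & P_{c q_i}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) & P_{cy}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) \\
P_{q_j c}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) & P_{q_j q_i}(\mathbf{x},\mathbf{x}'|t_{n-1}) & P_{q_j y}(\mathbf{x},\mathbf{x}'|t_{n-1}) \\
P_{yc}(\mathbf{x},\mathbf{x}',t_n|t_{n-1}) & P_{y q_i}(\mathbf{x},\mathbf{x}'|t_{n-1}) & P_{yy}(\mathbf{x},\mathbf{x}'|t_{n-1})
\end{bmatrix}
\]

and

\[
\mathbf{K}(t_n) =
\begin{bmatrix}
\mathbf{K}_{cc}(\mathbf{x},t_n) & \mathbf{K}_{c q_i}(\mathbf{x},t_n) & \mathbf{K}_{cy}(\mathbf{x},t_n) \\
\mathbf{K}_{q_j c}(\mathbf{x},t_n) & \mathbf{K}_{q_j q_i}(\mathbf{x},t_n) & \mathbf{K}_{q_j y}(\mathbf{x},t_n) \\
\mathbf{K}_{yc}(\mathbf{x},t_n) & \mathbf{K}_{y q_i}(\mathbf{x},t_n) & \mathbf{K}_{yy}(\mathbf{x},t_n)
\end{bmatrix}
\]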
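The conditioning step can be illustrated with a short Python sketch of the standard Kalman update in its generic matrix form, K = P H^T (H P H^T + R)^{-1}, followed by the mean and covariance updates. This is a minimal sketch under stated assumptions, not the implementation used in this work: the function name, the two-variable toy state (one concentration value and one ln-NAPL value), and all numbers are hypothetical.

```python
import numpy as np

def kalman_update(z_hat, P, H, R, z_star):
    """One conditioning step: returns the updated mean, updated covariance, and gain."""
    residual = z_star - H @ z_hat            # measurement residual, cf. equation (3.48)
    S = H @ P @ H.T + R                      # residual covariance (m x m)
    K = np.linalg.solve(S, H @ P).T          # K = P H^T S^{-1}; valid because P and S are symmetric
    z_new = z_hat + K @ residual             # mean update, cf. equations (3.45)-(3.47)
    P_new = P - K @ H @ P                    # covariance update, cf. equation (3.50)
    return z_new, P_new, K

# Toy state [c, y]: only c is measured, but y is correlated with c.
z_hat = np.array([0.0, 0.0])
P = np.array([[1.0, 0.5],
              [0.5, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
z_star = np.array([1.0])

z_new, P_new, K = kalman_update(z_hat, P, H, R, z_star)
```

In this toy case the unmeasured variable y is corrected through its covariance with c, and its variance drops from 1.0 to 0.8, which mirrors how concentration data can inform the flux and NAPL estimates.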
3.2.4    Filter  initialization

As already pointed out, in a discrete measurement time domain, the moment propagation equations are piecewise continuous and defined in a semi-open time interval $[t_{n-1}, t_n)$. This means that the system can be repeatedly initialized with the state variables and covariances updated on the measurement history through time $t_{n-1}$. However, at the start of this recursive procedure, before any measurements have been encountered, one must define the initial values for all of the first and second moments of the state variables. Since it is assumed that no tracer exists in the domain prior to the injection, it is reasonable to infer that the initial concentration is equal to zero without uncertainty at time $t_0$, from which it follows that all of the second moments involving concentration are zero. The initialization of the concentration moments for the Kalman filter is thus summarized as follows:

\[ \hat{c}(\mathbf{x},t_0|t_0) = 0 \tag{3.51} \]

\[ P_{cc}(\mathbf{x},\mathbf{x}',t_0|t_0) = P_{q_i c}(\mathbf{x},\mathbf{x}',t_0|t_0) = P_{yc}(\mathbf{x},\mathbf{x}',t_0|t_0) = 0, \quad \text{for } i = 1, 2, 3 \tag{3.52} \]

where $\hat{c}(\mathbf{x},t_0|t_0)$ denotes the conditional concentration mean at time $t_0$. Similarly, the unconditional first and second moments of the parameters, i.e., $q_i$, $i = 1, 2, 3$, and $\ln\theta_n$ (or $Y$), are used until they are updated using the first available measurements. The initialization of the parameters is summarized as follows:

\[ \hat{q}_i(\mathbf{x}|t_0) = \bar{q}_i(\mathbf{x}), \quad \text{for } i = 1, 2, 3 \tag{3.53} \]

\[ \hat{y}(\mathbf{x}|t_0) = \bar{y}(\mathbf{x}) \tag{3.54} \]

and

\[ P_{q_i q_j}(\mathbf{x},\mathbf{x}'|t_0) = P_{q_i q_j}(\mathbf{x},\mathbf{x}'), \quad \text{for } i, j = 1, 2, 3 \tag{3.55} \]

\[ P_{q_i y}(\mathbf{x},\mathbf{x}'|t_0) = P_{q_i y}(\mathbf{x},\mathbf{x}'), \quad \text{for } i = 1, 2, 3 \tag{3.56} \]

\[ P_{yy}(\mathbf{x},\mathbf{x}'|t_0) = P_{yy}(\mathbf{x},\mathbf{x}') \tag{3.57} \]

where $\bar{q}_i$ is the unconditional mean of the flux component in the $i$-th direction, $i = 1, 2, 3$, and $\bar{y}$ is the unconditional mean of $\ln\theta_n$. The unconditional covariances $P_{q_i q_j}(\mathbf{x},\mathbf{x}')$, $i, j = 1, 2, 3$, can be derived from a three-dimensional steady-state flow equation using spectral analysis [58, 57]. This derivation is based on the assumption that the hydraulic conductivity field is a stationary, statistically isotropic random field whose covariance structure can be represented by a negative exponential function. A detailed discussion of the derivation of the flux covariances is presented by Rubin and Dagan [106]. Expressions for $P_{q_i q_j}(\mathbf{x},\mathbf{x}')$ are listed in Appendix A.

In order to determine the unconditional covariances $P_{yy}(\mathbf{x},\mathbf{x}')$ and $P_{q_i y}(\mathbf{x},\mathbf{x}')$, $i = 1, 2, 3$, both the covariance structure of the unconditional random field $\ln\theta_n$ and the relationship between the random fields $\ln\theta_n$ and log hydraulic conductivity must be specified. In the last chapter, a linear relationship between the unconditional random fields of the logarithm of NAPL content, $\ln\theta_n$, and the logarithm of hydraulic conductivity, $\ln K$, was proposed, i.e.,

\[ \ln\theta_n = \bar{y} + \delta y = \bar{y} + \zeta\rho f + \sqrt{1 - \rho^2}\,\eta \tag{3.58} \]

where $\bar{y}$ and $\delta y$ represent the mean and perturbation of the random field $\ln\theta_n$, respectively. The constant $\zeta$ is the ratio of the standard deviation of the random $\ln\theta_n$ field to that of the random $\ln K$ field, i.e., $\zeta = \sigma_y/\sigma_f$, in which $f$ represents the perturbation component of the natural log hydraulic conductivity field. Equation (3.58) states that the log hydraulic conductivity is likely to be correlated with the log volumetric NAPL content, but that such a relationship is unlikely to be perfect. Therefore an additional random field $\eta$ is included. The coefficient $\rho$ is a weighting factor ($-1.0 \le \rho \le 1.0$) that determines the degree of correlation between the random fields. Equation (3.58) represents a positive (negative) correlation between $\ln K$ and $\ln\theta_n$ when $\rho$ is greater (less) than zero, which means that more NAPL is expected to occur in regions where hydraulic conductivity is large (small). Note that when $\rho = 0$, equation (3.58) represents uncorrelated $\ln K$ and $\ln\theta_n$ fields, and when $\rho = \pm 1$, equation (3.58) represents perfect positive (negative) correlation between these random fields. Expressions for $P_{yy}(\mathbf{x},\mathbf{x}')$ and $P_{q_i y}(\mathbf{x},\mathbf{x}')$ derived from equation (3.58) are listed in Appendix A.
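The role of the weighting factor in equation (3.58) can be checked numerically with a short Python sketch. It assumes the relationship takes the form delta_y = zeta * rho * f + sqrt(1 - rho^2) * eta and that eta has the same standard deviation as the ln-NAPL field (the values 1.0 and 0.8637 are taken from Table 3.1). The samples are drawn independently point by point, so spatial correlation is ignored here; the function name, the sample size, and the choice rho = -0.5 are hypothetical.

```python
import numpy as np

def ln_napl_perturbation(f, eta, rho, zeta):
    """Perturbation of ln(theta_n): zeta*rho*f + sqrt(1 - rho**2)*eta (assumed form of eq. 3.58)."""
    return zeta * rho * f + np.sqrt(1.0 - rho**2) * eta

sigma_f, sigma_y = 1.0, 0.8637
zeta = sigma_y / sigma_f

rng = np.random.default_rng(42)
f = rng.normal(0.0, sigma_f, size=200_000)      # ln K perturbation
eta = rng.normal(0.0, sigma_y, size=200_000)    # independent field, assumed sigma_eta = sigma_y

rho = -0.5
dy = ln_napl_perturbation(f, eta, rho, zeta)

sample_rho = np.corrcoef(f, dy)[0, 1]   # should be close to rho
sample_sigma_y = dy.std()               # should be close to sigma_y
```

Under these assumptions the variance of the generated perturbation is rho^2 sigma_y^2 + (1 - rho^2) sigma_y^2 = sigma_y^2 for any rho, so changing the correlation does not change the variability of the NAPL field.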

3.2.5    Solution  methods 

The moment propagation component of the Kalman filtering algorithm involves solving a system of six coupled partial differential equations defined over semi-open time intervals bounded by measurement times. These equations are solved numerically using a finite-difference scheme that is seven-point central-differenced in space and fully implicit in time (see Appendix E). The three-dimensional domain is discretized into a number of uniform blocks. Since the concentration covariance equations generally account for covariances evaluated at two spatial locations, i.e., $\mathbf{x}$ and $\mathbf{x}'$, a square-root decomposition procedure is used to transform these equations into their equivalent square-root form, which involves only one spatial location vector $\mathbf{x}$, following the procedure described by Maybeck [82] (see Appendix B). At time $t_0$, the initial covariance matrix is symmetric and positive-definite, and its square root can be found with the Cholesky decomposition method [46], which factors the original matrix into the product of a lower triangular matrix and its transpose (see Appendix D). Use of the square-root method provides the advantages of reduced data storage and improved computational precision and stability [82]. The discretized finite-difference equations are solved with an LSOR matrix solver.

Concentration breakthrough data from a partitioning tracer provide information on both NAPL and Darcy flux, while data from a nonpartitioning tracer provide information on Darcy flux only. Therefore maximum information can be gained by utilizing concentration measurements from both partitioning and nonpartitioning tracers. Concentration measurements of both tracers can be incorporated in the filtering algorithm through conditioning in either a simultaneous or a sequential manner. In the simultaneous conditioning approach, the general extended Kalman filter must be slightly modified so that both the partitioning tracer concentration and the nonpartitioning tracer concentration are included as state variables at the same time. Simultaneous conditioning requires increased data storage and CPU time because the size of the covariance matrix grows with the square of the number of state variables.
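The Cholesky square root mentioned above can be illustrated with a small Python sketch. It builds a negative-exponential covariance matrix for a hypothetical one-dimensional transect of eight block centres (spacing 0.254 m, variance 0.746, correlation length 0.3048 m, values borrowed from the synthetic case for illustration only) and factors it with NumPy. The helper name is an assumption, and the sketch does not reproduce the square-root propagation of Appendix B.

```python
import numpy as np

def exponential_covariance(coords, sigma2, corr_len):
    """Isotropic negative-exponential covariance: sigma2 * exp(-r / corr_len)."""
    r = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-r / corr_len)

# Hypothetical transect of block centres along the longitudinal direction.
x = np.arange(8) * 0.254
coords = np.column_stack([x, np.zeros_like(x), np.zeros_like(x)])

P0 = exponential_covariance(coords, sigma2=0.746, corr_len=0.3048)
S0 = np.linalg.cholesky(P0)   # lower triangular factor with P0 = S0 @ S0.T
```

Propagating the factor S0 instead of P0 keeps the implied covariance symmetric and positive semi-definite by construction, which is one reason square-root filters are regarded as numerically robust.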

A significant reduction in computational burden can be achieved by adopting the sequential filtering approach, which deals with one tracer at a time and therefore constitutes a five-state-variable filter rather than the six-state-variable filter of the simultaneous conditioning approach. The sequential approach assumes that nonpartitioning tracer concentration measurements contain the highest quality information for flux estimation, and therefore that further conditioning of the flux field on partitioning tracer concentration measurements should not produce significant improvement in the flux estimates. The sequential filtering approach is adopted here and follows these steps: first, concentration measurements for a nonpartitioning tracer are used to update the Darcy flux, and the updated flux covariance matrix is saved; second, with the updated flux and covariance matrix as input, concentration measurements of a partitioning tracer are used to update only the NAPL content. The structure of the sequential filtering algorithm is illustrated in Figure 3.2.
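The storage argument can be made concrete with a small Python calculation. It assumes a dense covariance matrix over all state variables and uses the 14 x 7 x 7 grid of the synthetic case study; the function name is hypothetical, and the count ignores the savings available from symmetry or from the square-root form.

```python
def covariance_entries(n_blocks, n_state_fields):
    """Number of entries in a dense covariance matrix for n_state_fields fields on n_blocks blocks."""
    size = n_blocks * n_state_fields
    return size * size

n_blocks = 14 * 7 * 7                            # grid used in the synthetic examples
sequential = covariance_entries(n_blocks, 5)     # c, q1, q2, q3, ln(theta_n)
simultaneous = covariance_entries(n_blocks, 6)   # two tracer concentrations plus q1, q2, q3, ln(theta_n)
ratio = simultaneous / sequential
```

For this grid the dense covariance grows from 11,764,900 to 16,941,456 entries, a factor of 36/25 = 1.44, when the second tracer concentration is added to the state vector.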

Output of the Kalman filtering algorithm includes (1) updated first moments, which are the best estimates of the state variables, and (2) updated second moments, which provide a measure of the estimation uncertainty and quantify the cross-correlations between the state variables.

3.3    Synthetic  Case  Studies 

In this section, several synthetic examples of three-dimensional transport of a sorbing solute in NAPL-contaminated aquifers are investigated to illustrate the performance of the Kalman filtering algorithm. Since the main aim is to produce a site-specific prediction of solute movement in the subsurface environment together with a reliable estimation of aquifer hydrogeochemical parameters (i.e., the volumetric residual NAPL content $\theta_n$ and the Darcy flux components $q_i$ in this study), the use of synthetically generated data sets allows direct comparison of the optimal estimation results with the "true" values, which are known perfectly.

[Figure 3.2 is a flowchart with two parallel columns, "Nonpartitioning tracer simulation" and "Partitioning tracer simulation". In each column, prior parameter moments (Darcy flux moments on the left, ln-NAPL content moments on the right) and initial concentration moments enter the moment propagation algorithm, which advances them to time $t_n$; the tracer concentration measurements $c^*(t_n)$ then enter the moment updating algorithm ($t_n \to t_n^+$). The nonpartitioning column outputs updated Darcy flux moments and concentration moments; the partitioning column outputs updated ln-NAPL content moments and concentration moments.]

Figure 3.2: Structure of the sequential Kalman filtering.

The three-dimensional simulation domain utilized in the synthetic example problems is assumed to be fully saturated by water and a small amount of NAPL, which is randomly present in its residual phase. A nonpartitioning tracer and a partitioning tracer are simultaneously introduced at time $t_0$ from the upstream boundary and are carried through the domain by a steady-state flow. The boundary conditions for this transport problem have been discussed in the previous sections. The parameters used in generating the synthetic data set are based on the field tracer experiment conducted by Annable et al. [5] and are given in Table 3.1. A turning bands algorithm is used to generate the heterogeneous ln hydraulic conductivity and residual NAPL content fields. The ln hydraulic conductivity field is statistically isotropic and has a correlation scale ($\lambda_f$) of 0.3048 m (1 ft) and a variance of $\sigma_f^2 = 1.0$. The ln residual NAPL content field is also assumed to be isotropic, with a correlation scale ($\lambda_y$) of 0.3048 m (1 ft) and a variance of $\sigma_y^2 = 0.746$. The ensemble means of the $\ln K$ and $\ln\theta_n$ fields are 2.84 and $-4.62$, respectively. The sensitivity of the optimal estimation to the correlation between $\ln K$ and $\ln\theta_n$ is investigated by varying the correlation coefficient $\rho$ appearing in equation (3.58).

Once the hydraulic conductivity fields are simulated using the turning bands algorithm, they are used as input parameters in a three-dimensional finite-difference groundwater flow code to produce the steady-state head field and Darcy flux components. To simulate the transport of tracers and produce the heterogeneous concentration field, the three-dimensional Darcy flux and random NAPL fields are utilized in the traditional advection-dispersion-retardation equation. This equation is solved numerically using a finite-difference solver. Tracer concentration "measurements" at a number of fixed locations are "collected" at discrete times and utilized by the filtering algorithm to estimate the tracer distribution and both the flux and the residual NAPL content fields.

Table 3.1: Input parameters for conditional simulations

Parameter                                        Value
Domain size                                      3.556 m x 3.556 m x 1.524 m
Discretization                                   $N_x = 14$, $N_y = 7$, $N_z = 7$
                                                 $\Delta x = 0.254$ m, $\Delta y = 0.508$ m, $\Delta z = 0.218$ m
Volumetric water content                         $\theta_w = 0.21$
Mean hydraulic gradient                          $J = 0.05893$
Longitudinal (transverse) dispersivity           $\alpha_L$ ($\alpha_T$) $= 0.1$ m
Tracer injection concentration                   $c_b = 1.0$ (normalized)
Duration of tracer input                         $t_b = 0.1$ days
Mean hydraulic conductivity                      $K_g = 17.1$ m/day
Tracer partitioning coefficient                  $K_N = 9.0$
ln K standard deviation                          $\sigma_f = 1.0$
ln K correlation length                          $\lambda_f = 0.3048$ m
Unconditional mean of $\ln\theta_n$              $\bar{y} = -4.62$
Unconditional $\ln\theta_n$ standard deviation   $\sigma_y = 0.8637$
$\ln\theta_n$ correlation length                 $\lambda_y = 0.3048$ m
Unconditional mean of Darcy flux components      $\bar{q}_x = 1.008$ m/day, $\bar{q}_y = \bar{q}_z = 0.0$
Mean of $\eta$                                   $\bar{\eta} = 0$
Standard deviation of $\eta$                     $\sigma_\eta = 0.8636$
Correlation length of $\eta$                     $\lambda_\eta = 0.3048$ m

3.3.1    Problem  definition,  boundary  conditions 

Figure 3.3 shows the three-dimensional simulation domain $D$ (3.556 m × 3.556 m × 1.524 m) and the boundary conditions for the transport problem. Using a finite-difference algorithm, $D$ is uniformly divided into 14 × 7 × 7 blocks, over each of which all of the variables of interest are defined as volumetric averages. Tracers (partitioning and nonpartitioning) are introduced from the upstream face of the box at a fixed concentration $c_b$ of 1.0 (dimensionless after normalization) for a period $t_b$ of 0.1 days. The volumetric water content $\theta_w$ is assumed to be spatially constant at 0.21 and does not change throughout the solute transport process. Note that the local-scale dispersivities ($\alpha_L$ and $\alpha_T$), tracer injection concentration ($c_b$), injection time ($t_b$), tracer partitioning coefficient ($K_N$), and the mean hydraulic gradient ($J$) are all assumed to be deterministic constants.

The sampling network (Figure 3.4a) for the base case consists of twelve multilevel samplers (MLS) evenly spaced in $D$. Each multilevel sampler provides five samples at equally spaced vertical locations, for a total of sixty concentration measurements at each measuring time. To gain insight into sampling network design and sampling frequency, two other hypothetical sampling scenarios are also considered: (1) twelve multilevel samplers in a different spatial configuration, generating 60 measurements each time (Figure 3.4b); and (2) eighty-four multilevel samplers, generating 420 measurements each time (Figure 3.4c).

In the base case, the random fields of $\ln K$ and $\ln\theta_n$ are assumed to be statistically uncorrelated, i.e., $\rho = 0$. Synthetic cases that involve weak correlations ($\rho = \pm 0.25$) as well as perfectly negative correlation ($\rho = -1$) between the $\ln K$ and $\ln\theta_n$ fields are also studied to examine the impact of the $\ln K$-$\ln\theta_n$ correlation on the optimal estimation results. Table 3.2 summarizes the parameters used in the five synthetic cases.
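
To make the role of the correlation coefficient concrete, the sketch below draws pairs of zero-mean perturbations with a prescribed correlation. It rests on several assumptions: equation (3.58) is not reproduced in this section, so the linear construction in the docstring is one common choice rather than the form used in this study; the auxiliary variable `eta` is modeled on the field listed in Table 3.1, which is consistent with this construction but does not confirm it; and spatially uncorrelated Gaussian noise stands in for the turning-bands fields. The function names `correlated_pair` and `sample_corr` are hypothetical.

```python
import math
import random


def correlated_pair(n, rho, sigma_f, sigma_y, seed=0):
    """Draw n samples of (f', y') with correlation coefficient rho.

    Assumed construction (illustrative only):
        y' = rho * (sigma_y / sigma_f) * f' + sqrt(1 - rho**2) * eta
    where eta is independent of f' and has standard deviation sigma_y.
    """
    rng = random.Random(seed)
    f = [rng.gauss(0.0, sigma_f) for _ in range(n)]
    eta = [rng.gauss(0.0, sigma_y) for _ in range(n)]
    scale = rho * sigma_y / sigma_f
    resid = math.sqrt(1.0 - rho * rho)
    y = [scale * fi + resid * ei for fi, ei in zip(f, eta)]
    return f, y


def sample_corr(a, b):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# rho values of the five synthetic cases (cases 1 and 5 share rho = 0).
for rho in (0.0, 0.25, -0.25, -1.0):
    f, y = correlated_pair(50000, rho, sigma_f=1.0, sigma_y=0.8636)
    print(rho, round(sample_corr(f, y), 3))
```

Under this construction the variance of y' equals sigma_y squared for every rho, so changing the correlation does not change the marginal statistics of the NAPL field.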

3.3.2    Conditional  simulation  results 

In the base case, the random fields $\ln K$ and $\ln\theta_n$ are assumed to be statistically uncorrelated, i.e., $\rho = 0$. The synthetic concentration measurements of a nonpartitioning tracer and a partitioning tracer (partitioning coefficient $K_N = 9.0$) are collected from twelve multilevel samplers whose spatial configuration is shown in Figure 3.4a. The mean and covariances of the state variables are conditioned seven times, each time on the 60 measurements taken at that sampling time (from 0.2 days to 2.0 days at intervals of 0.3 days).

Figure 3.3: Schematic diagram of the unconditional simulation domain and boundary conditions. Twelve multilevel samplers (MLSs) generate 60 concentration measurements at each measuring time. [Boundary-condition labels legible in the figure: $q_x c - \theta_w D_x\,(\partial c/\partial x) = q_x c_b$ for $0 < t < 0.1$ d; $\partial c/\partial y = 0$.]

The random replicate of the log hydraulic conductivity $\ln K$ generated by a three-dimensional turning-bands program at horizontal layers Z2, Z3, Z4, and Z5 (located at 0.327, 0.544, 0.762, and 0.980 m above the base) is shown in Figure 3.5. Figure 3.6
shows the same random $\ln K$ field at vertical slices Y2, Y4, and Y6 (located at 0.762, 1.778, and 2.794 m from the origin in the y-direction, respectively). The mean and standard deviation computed over this particular realization of $\ln K$ are 2.818 and 0.935, which are very close to the unconditional ensemble mean ($F = 2.839$) and ensemble standard deviation ($\sigma_f = 1.0$), respectively. The steady-state flow field is calculated by solving the steady-state flow equation numerically with an iterative finite-difference method, and the components of the Darcy flux are estimated using a second-order finite-difference procedure on the resulting head field. Figure 3.7 shows the x-y component of the Darcy flux vector, i.e., $q_{xy} = q_x + q_y$, at the horizontal layers Z2, Z3, Z4, and Z5 resulting from this random conductivity realization, whereas Figure 3.8 shows the x-z component of the simulated Darcy flux vector, i.e., $q_{xz} = q_x + q_z$, at vertical slices Y2, Y4, and Y6. Note that the arrow legend for each vector plot represents the maximum value in that particular layer.

Table 3.2: Input parameters for the conditional case studies.

                                Case
Parameter                       1 (Base)   2          3          4          5
------------------------------  ---------  ---------  ---------  ---------  ---------
$F$ (unconditional)             2.839      2.839      2.839      2.839      2.839
$\sigma_f$ (unconditional)      1.000      1.000      1.000      1.000      1.000
$\lambda_f$                     0.3048 m   0.3048 m   0.3048 m   0.3048 m   0.3048 m
$\bar{y}$ (unconditional)       -4.6193    -4.6193    -4.6193    -4.6193    -4.6193
$\sigma_y$ (unconditional)      0.8636     0.8636     0.8636     0.8636     0.8636
$\lambda_y$                     0.3048 m   0.3048 m   0.3048 m   0.3048 m   0.3048 m
$\rho$                          0          0.25       -0.25      -1.0       0
Measurement No.                 60         60         60         60         420
Measurement error               0.03       0.03       0.03       0.03       0.03
Sampling network                Fig. 3.5   Fig. 3.5   Fig. 3.5   Fig. 3.5   Fig. 3.5

[One or more rows between $\lambda_y$ and $\rho$ have illegible labels in the source; the surviving entries are 0.8636, 0.8636, 0.8636, 0, 0, 0.8636, 0.8636, 0.3048 m, 0.3048 m.]
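
The layer elevations and slice positions quoted for Z2 to Z5 and Y2 to Y6 can be reproduced from the domain geometry. The sketch below assumes block-centered coordinates on a uniform grid, which is an interpretation of the quoted values rather than something stated explicitly in the text; the function name `block_centers` is hypothetical.

```python
def block_centers(length, n_blocks):
    """Centers of n_blocks equal blocks spanning [0, length]."""
    step = length / n_blocks
    return [(k + 0.5) * step for k in range(n_blocks)]


z = block_centers(1.524, 7)   # vertical direction, m above the base
y = block_centers(3.556, 7)   # transverse direction, m from the origin

# Label blocks from 1, as in the layer names Z1..Z7 and slice names Y1..Y7.
layers = {f"Z{k + 1}": round(zk, 3) for k, zk in enumerate(z)}
slices = {f"Y{k + 1}": round(yk, 3) for k, yk in enumerate(y)}
print(layers)
print(slices)
```

With this convention Z2 to Z5 fall at about 0.327, 0.544, 0.762, and 0.980 m, and Y2, Y4, and Y6 at 0.762, 1.778, and 2.794 m, matching the values quoted in the text.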

The optimal estimation procedure begins by using the nonpartitioning tracer data to update the Darcy flux field. The propagated moments are conditioned on the 60 nonpartitioning tracer concentration measurements available at 0.2, 0.5, 0.8, 1.1, 1.4, 1.7, and 2.0 days. Figure 3.9 is the vector plot of the x-y component $q_{xy}$ of the estimated Darcy flux after the final conditioning time, at horizontal layers Z2, Z3, Z4, and Z5, and Figure 3.10 is the vector plot of the x-z component, $q_{xz} = q_x + q_z$, at vertical slices Y2, Y4, and Y6. At layers Z2 and Z3, the estimated distribution of $q_{xy}$ implies that conductivity is higher in areas near the outflow boundary and lower in areas immediately downstream of the inflow boundary; at the upper layers Z4 and Z5, the Kalman filter was able to distinguish a low-flux area in the center of these layers. Comparison of the estimated Darcy flux with the simulated flux (Figures 3.7 and 3.8) and the original random conductivity field (Figures 3.5 and 3.6) indicates that the Kalman filter is able to capture the dominant features of the originally simulated flux distribution. As expected, the estimated flux field is smoother than the originally simulated flux field.

Once the updated flux components are obtained, they are used in the subsequent simulation of the partitioning tracer to estimate the $\ln\theta_n$ field from partitioning tracer concentration measurements at the same measuring times. In the partitioning tracer simulation, the flux values are not updated again, since the flux estimates from the nonpartitioning tracer should already incorporate most of the available information.

Figure 3.11 compares both the conditional ($\hat{c}$) and the unconditional ($\bar{c}$) mean concentration distributions with the synthetically simulated concentration field for horizontal layer Z4 (located at 0.762 m above the base of the domain) at times 0.5, 1.0, and 1.5 days. The "true" concentration field is presented as a color image, overlaid by solid and dashed contour lines representing the conditional and unconditional mean concentration distributions, respectively. As pointed out in Chapter 2, unconditional simulation does not generate the site-specific distribution of solute concentration. Note that during conditioning a measurement noise of 0.03 is assumed for each concentration measurement to account for possible measurement and analytical error under field conditions.

Figure 3.11 shows that the conditional mean concentration field reproduces the pattern of the "true" concentration field much better than the unconditional mean concentration field does. At 0.5 days (after two conditioning steps), the conditional mean concentration contours have already become very tortuous, as the Darcy flux, NAPL, and concentration fields have all been updated. Note that as time evolves and more measurements are utilized by the conditioning algorithm, the conditional mean concentration distribution increasingly approaches the "true" solute plume. This suggests that, for the purpose of predicting solute plume movement, the Kalman filtering algorithm needs repeated concentration measurements. Figure 3.12 plots the contours of the conditional concentration standard deviation $\sigma_c(\mathbf{x})$, which provides a quantitative measure of the uncertainty associated with the conditional concentration prediction. Again, these contours are spatially tortuous compared with their counterparts from the unconditional simulation discussed in the last chapter. The magnitude of the conditional concentration standard deviation $\sigma_c$ decreases as the contour lines concentrically approach the sampling points, and reduces to the measurement error at the measurement locations.
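
The limiting behavior at the sampling points can be seen from the scalar form of the Kalman variance update. This is a simplified sketch that assumes a single concentration value measured directly, with prior variance $\sigma_c^2$, posterior variance $\hat{\sigma}_c^2$, and measurement-noise variance $\sigma_v^2$; the hat and the symbol $\sigma_v$ are introduced here for illustration only and are not the notation of this chapter.

```latex
\hat{\sigma}_c^{2}
  = \sigma_c^{2} - \frac{\sigma_c^{4}}{\sigma_c^{2} + \sigma_v^{2}}
  = \frac{\sigma_c^{2}\,\sigma_v^{2}}{\sigma_c^{2} + \sigma_v^{2}}
  \;\longrightarrow\; \sigma_v^{2}
  \quad \text{as } \sigma_v^{2}/\sigma_c^{2} \to 0 .
```

In this simplified setting the conditional variance at a measured point never exceeds the measurement-noise variance and approaches it when the prior uncertainty is much larger than the noise.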

Figure 3.13 and Figure 3.14 show the estimates of the logarithm of volumetric residual NAPL content. The synthetically generated (or "true") random field, denoted by $\ln\theta_n$, is plotted as an interpolated color image, while the estimate $\widehat{\ln\theta_n}$ is shown by the contour lines superimposed on the image. Figure 3.13 shows images and contours on four horizontal layers Z2, Z3, Z4, and Z5, located at 0.327, 0.544, 0.762, and 0.980 m above the base of the domain $D$, respectively, while Figure 3.14 shows vertical slices Y2, Y4, and Y6, located at y = 0.762, 1.778, and 2.794 m, respectively. Sampling locations are represented by black dots. These figures indicate that the algorithm successfully captures the dominant pattern of the NAPL distribution. Areas with relatively high NAPL content in the upstream portion of the domain are recovered quite well by the algorithm. The estimated values of $\ln\theta_n$ approach the prior (unconditional) mean ($-4.619$) in most of the downstream region as well as in the near-boundary areas, where little information can be recovered from interior measurements of tracer concentration. This supports the conclusion drawn from the study of the unconditional correlation $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ (Chapter 2) that a concentration measurement of a partitioning tracer taken at location $\mathbf{x}$ carries information about NAPL (at location $\mathbf{x}'$) primarily upstream of $\mathbf{x}$, and that very little information is gained downstream of the measurement. The estimated $\ln\theta_n$ distribution has a mean of $\hat{y} = -4.547$ and a standard deviation of $\sigma_{\hat{y}} = 0.176$, indicating that the algorithm produces a slight overestimation compared to the unconditional mean ($\bar{y} = -4.619$). Furthermore, the algorithm tends to overestimate $\ln\theta_n$ in areas where the "true" values are less than the unconditional mean, and to underestimate it in areas where the "true" values are greater than the unconditional mean. It is possible that the assumption of "small" variations in residual NAPL content is violated in these areas, making it difficult for the algorithm to produce accurate estimates.
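
To put the log-scale means in physical terms, the short calculation below converts the unconditional mean and the mean of the conditional estimate into geometric-mean residual NAPL contents. It is only an arithmetic illustration of the two values quoted above; the variable names are hypothetical.

```python
import math

prior_mean_ln = -4.619   # unconditional mean of ln(theta_n)
cond_mean_ln = -4.547    # spatial mean of the conditional estimate

# Geometric-mean residual NAPL content implied by each log-mean.
theta_prior = math.exp(prior_mean_ln)
theta_cond = math.exp(cond_mean_ln)

# Ratio of the two geometric means; a value above 1 means the conditional
# estimate is, on average, higher than the unconditional mean.
ratio = theta_cond / theta_prior
print(f"{theta_prior:.5f} {theta_cond:.5f} {ratio:.3f}")
```

The two log-means correspond to volumetric NAPL contents of roughly 0.0099 and 0.0106, so the difference of 0.072 in log space amounts to a geometric mean that is about 7 percent higher.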

The conditional standard deviation of $\ln\theta_n$, i.e., $\hat{\sigma}_y(\mathbf{x})$, is regarded as a measure of the uncertainty of the estimate $\widehat{\ln\theta_n}$. Corresponding to Figures 3.13 and 3.14, Figure 3.15 shows $\hat{\sigma}_y(\mathbf{x})$ at four horizontal layers Z2, Z3, Z4, and Z5, while Figure 3.16 shows the three vertical slices Y2, Y4, and Y6. These figures indicate that the magnitude of the conditional standard deviation $\hat{\sigma}_y$ (i.e., the uncertainty of $\ln\theta_n$) decreases as the contour lines concentrically approach the measurement points located in the upstream area. The smallest values of $\hat{\sigma}_y$ at the upstream sampling locations range from 0.52 to 0.56, indicating that conditioning achieves a substantial reduction of uncertainty relative to the unconditional standard deviation $\sigma_y = 0.864$. The conditional standard deviation $\hat{\sigma}_y$ increases quite rapidly downstream and approaches the unconditional standard deviation $\sigma_y$ at the last row of the MLSs. This again indicates that very little information about NAPL content downstream can be recovered from concentration measurements made at the last row of sampling points.

To check the performance of the Kalman filtering algorithm, cumulative distribution functions (cdf) of the normalized residual of both the conditional (posterior) and unconditional (prior) $\ln\theta_n$ are compared in Figure 3.17; also shown is the cdf for a standard Gaussian distribution. The conditional normalized residual of $\ln\theta_n$ is defined as $[\ln\theta_n(\mathbf{x}) - \widehat{\ln\theta_n}(\mathbf{x})]/\hat{\sigma}_y(\mathbf{x})$, while the unconditional normalized residual of $\ln\theta_n$ is defined as $[\ln\theta_n(\mathbf{x}) - \overline{\ln\theta_n}(\mathbf{x})]/\sigma_y(\mathbf{x})$. The unconditional distribution of the normalized residual of $\ln\theta_n$ has a mean of $-8.92 \times 10^{-2}$ and a standard deviation of 0.93, and is fairly close to a standard Gaussian distribution. If the filtering algorithm is performing accurately, the normalized residual should have zero mean and unit standard deviation. The distribution of the conditional normalized residual of $\ln\theta_n$ has a mean of $-0.192$ and a standard deviation of 0.973, and is reasonably close to Gaussian. The fact that the mean is less than zero indicates that the residual NAPL content is slightly overestimated by the stochastic model.
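
The residual statistics used in this check can be sketched as follows. The sketch assumes that the residual is the "true" value minus the estimate, which is consistent with a negative mean indicating overestimation, and that the standard deviations are population values; the exact formulas are those of Appendix C and are not reproduced here. The function name `residual_stats` and the five-block data set are hypothetical.

```python
import math


def residual_stats(true_vals, est_vals, std_vals):
    """Mean and standard deviation of residuals and normalized residuals.

    residual            r_i = true_i - est_i
    normalized residual z_i = r_i / std_i
    """
    resid = [t - e for t, e in zip(true_vals, est_vals)]
    norm = [r / s for r, s in zip(resid, std_vals)]

    def mean_std(values):
        m = sum(values) / len(values)
        var = sum((v - m) ** 2 for v in values) / len(values)
        return m, math.sqrt(var)

    return mean_std(resid), mean_std(norm)


# Hypothetical five-block example (not data from this study).
true_ln = [-5.2, -4.9, -4.4, -4.0, -5.0]
prior_mean = [-4.619] * 5   # unconditional mean everywhere
prior_std = [0.864] * 5     # unconditional standard deviation everywhere
(res_mean, res_std), (norm_mean, norm_std) = residual_stats(
    true_ln, prior_mean, prior_std
)
```

With a spatially constant prior standard deviation, the normalized statistics are simply the residual statistics divided by 0.864. The prior entries of Table 3.3 follow this pattern: $-7.70 \times 10^{-2} / 0.864 \approx -8.9 \times 10^{-2}$ and $0.803 / 0.864 \approx 0.93$.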

The best estimates of NAPL content and Darcy flux shown in these figures are based on repeated tracer concentration measurements at the same locations. For experimental design purposes, it is desirable to know how many measurements, collected at the same location, are sufficient for the inverse algorithm to produce estimates of the state variables of acceptable quality. In the analysis of the unconditional correlations $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ and $\rho_{cq_i}(\mathbf{x}, \mathbf{x}')$, $i = 1, 2, 3$ (Chapter 2), it was found that the magnitude of these correlations remained almost unchanged with time over the course of the tracer experiment. The correlation $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ or $\rho_{cq_i}(\mathbf{x}, \mathbf{x}')$, defined as $P_{cy}(\mathbf{x}, \mathbf{x}')/[\sigma_c(\mathbf{x})\,\sigma_y(\mathbf{x}')]$ or $P_{cq_i}(\mathbf{x}, \mathbf{x}')/[\sigma_c(\mathbf{x})\,\sigma_{q_i}(\mathbf{x}')]$, respectively, simply measures the degree of correlation between the ln NAPL content or Darcy flux component to be estimated at location $\mathbf{x}'$ and the concentration measurement made at $\mathbf{x}$. This time-invariance therefore suggests that, at a particular location, measuring concentration once may provide all of the information available for estimating NAPL and Darcy flux in the upstream aquifer volume swept by the tracer. This hypothesis may be examined quantitatively by inspecting the evolution of the conditional correlation $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ (defined as $P_{cy}(\mathbf{x}, \mathbf{x}')/[\sigma_c(\mathbf{x})\,\sigma_y(\mathbf{x}')]$, evaluated with the conditional moments) over the sequential conditioning process.

To examine this hypothesis, a simplified conditional simulation based on partitioning tracer concentration measurements at 0.5 and 1.0 days was conducted, and the correlation $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ prior to and immediately after the measuring times is shown in Figure 3.18. Note that $\mathbf{x}$ represents the location at which the concentration measurement is taken. Prior to the first available concentration measurements at 0.5 days, contours of $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ are symmetric, since the unconditioned $\ln\theta_n$ field is stationary and the Darcy flux field is yet to be conditioned. After the first conditioning step at 0.5 days, the magnitude of $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ has decreased substantially from that of the unconditional $\rho_{cy}(\mathbf{x}, \mathbf{x}')$, indicating that information from the concentration measurements has brought the conditional mean $\ln\theta_n$ field closer to the "true" $\ln\theta_n$ field than the prior mean was, and thus $\ln\theta_n$ is associated with reduced uncertainty. As $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ propagates between measurement times, the sign of the correlation changes from negative to positive, indicating that the center of mass of the tracer plume has passed the sampling location $\mathbf{x}$. However, the magnitude of $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ remains essentially unchanged between 0.5 and 1.0 days and after the second conditioning step at 1.0 days, supporting the hypothesis that little additional information can be obtained from further measurements made at previously sampled locations.
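
The qualitative behavior described above can be reproduced with a two-variable toy problem. The sketch below is not the filter of this chapter: it treats a single concentration c and a single parameter y as static Gaussian variables, ignores tracer transport between measurement times, and uses hypothetical prior values (only 0.864 and 0.03 echo numbers quoted in this chapter, and 0.03 is assumed to be a standard deviation). The function names `condition_on_c` and `corr` are hypothetical.

```python
import math


def condition_on_c(p_cc, p_cy, p_yy, r):
    """One Kalman covariance update for the state (c, y) when c is measured
    directly with noise variance r. Returns the updated (p_cc, p_cy, p_yy)."""
    s = p_cc + r                      # innovation variance
    return (p_cc - p_cc * p_cc / s,   # var(c | measurement)
            p_cy - p_cc * p_cy / s,   # cov(c, y | measurement)
            p_yy - p_cy * p_cy / s)   # var(y | measurement)


def corr(p_cc, p_cy, p_yy):
    """Correlation coefficient from variances and covariance."""
    return p_cy / math.sqrt(p_cc * p_yy)


# Hypothetical prior: sigma_c = 0.3, sigma_y = 0.864, rho_cy = -0.6,
# and measurement-noise standard deviation 0.03.
sigma_c, sigma_y, rho0, sigma_v = 0.3, 0.864, -0.6, 0.03
prior = (sigma_c ** 2, rho0 * sigma_c * sigma_y, sigma_y ** 2)
first = condition_on_c(*prior, sigma_v ** 2)
second = condition_on_c(*first, sigma_v ** 2)

for label, cov in (("prior", prior), ("first", first), ("second", second)):
    print(label, round(math.sqrt(cov[2]), 3), round(corr(*cov), 3))
```

In this toy setting the first update lowers the standard deviation of y from 0.864 to about 0.69 and sharply reduces the magnitude of the correlation, while a second measurement of the same quantity changes the standard deviation of y only marginally.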

The sensitivity of the quality of the NAPL estimates to conditioning frequency was studied further by varying the conditioning frequency and comparing the resulting statistics of the $\ln\theta_n$ estimates. In addition to the base case, in which the moments of the state variables were conditioned seven times at 0.3-day intervals, two other similar simulations were run: (1) moments conditioned four times at 0.6-day intervals (i.e., at 0.2, 0.8, 1.4, and 2.0 days); and (2) moments conditioned three times at 0.9-day intervals (i.e., at 0.2, 1.1, and 2.0 days). Statistics of both the unconditional (prior) and the conditional (posterior) $\ln\theta_n$ in comparison with the "true" data for all three simulations are summarized in Table 3.3. Formulas used in the table are defined in Appendix C.
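
The three conditioning schedules can be generated from the start time, end time, and interval, as in the sketch below. The function name `conditioning_times` is hypothetical, and the rounding is only there to suppress floating-point noise in the printed times.

```python
def conditioning_times(start, stop, interval):
    """Measurement times from start to stop (inclusive) at a fixed interval."""
    n = int(round((stop - start) / interval)) + 1
    return [round(start + k * interval, 10) for k in range(n)]


# Intervals of 0.3, 0.6, and 0.9 days between 0.2 and 2.0 days.
schedules = {dt: conditioning_times(0.2, 2.0, dt) for dt in (0.3, 0.6, 0.9)}
for dt, times in schedules.items():
    print(dt, len(times), times)
```

The three intervals yield seven, four, and three conditioning times, respectively, and all three schedules share the first (0.2 days) and last (2.0 days) measurement times.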

Table 3.3 shows that the $\ln\theta_n$ estimates obtained by conditioning at different frequencies exhibit very similar statistical behavior. Under the assumption of uncorrelated $\ln K$ and $\ln\theta_n$, the algorithm performs well in all three simulations, although $\ln\theta_n$ is slightly overestimated. As the state variables are conditioned more frequently, slightly more reduction of estimation uncertainty is observed: the conditional standard deviation of the estimation residual decreases from 0.800 for the 0.9-day conditioning interval to 0.763 for the 0.3-day conditioning interval. Similar behavior is also observed in Case 2 and Case 3, in which the random fields of $\ln K$ and $\ln\theta_n$ are assumed to be weakly positively correlated ($\rho = 0.25$) and weakly negatively correlated ($\rho = -0.25$), respectively. Statistics for Case 2 and Case 3 are summarized in Table 3.4 and Table 3.5. As the state variables are conditioned more frequently, the conditional standard deviation of the estimation residual decreases from 0.750 to 0.744 for Case 2 and from 0.754 to 0.747 for Case 3. Note that, unlike in Case 1, $\ln\theta_n$ is underestimated by a very small amount in Case 2 and Case 3. Overall, these data further indicate that conditioning with more frequent concentration measurements produces only slightly better NAPL estimates.
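
The size of the improvement can be expressed as a relative reduction in the residual standard deviation. The calculation below uses only the pairs of values quoted in the preceding paragraph; the dictionary layout and names are hypothetical.

```python
# Conditional residual standard deviations quoted in the text:
# (0.9-day interval, 0.3-day interval) for each case.
residual_std = {
    "Case 1 (rho = 0)": (0.800, 0.763),
    "Case 2 (rho = 0.25)": (0.750, 0.744),
    "Case 3 (rho = -0.25)": (0.754, 0.747),
}

# Fractional reduction obtained by conditioning every 0.3 days
# instead of every 0.9 days.
relative_gain = {
    case: (coarse - fine) / coarse
    for case, (coarse, fine) in residual_std.items()
}
for case, gain in relative_gain.items():
    print(f"{case}: {100.0 * gain:.1f}% lower residual std")
```

The reduction is below 5 percent in Case 1 and below 1 percent in Cases 2 and 3, which quantifies the statement that more frequent conditioning improves the estimates only slightly.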


Table 3.3: Summary of the prior and posterior statistics for synthetic case No. 1 ($\rho = 0.0$) conditioned with concentration measurements at frequencies 0.3 days, 0.6 days and 0.9 days.

                                        Prior              Posterior (conditional)
Parameter                               (unconditional)    0.3 days × 7     0.6 days × 4     0.9 days × 3
--------------------------------------  -----------------  ---------------  ---------------  ---------------
Mean, $\bar{y}$ or $\hat{y}$            -4.619             -4.547           -4.555           -4.532
Std., $\sigma_y$ or $\sigma_{\hat{y}}$  0.864              0.176            0.190            0.188
Residual, mean                          -7.70 × 10^{-2}    -0.150           -0.142           -0.164
Residual, std.                          0.803              0.763            0.781            0.800
Normalized residual, mean               -8.92 × 10^{-2}    -0.192           -0.180           -0.205
Normalized residual, std.               0.930              0.973            0.987            1.000

Table 3.4: Summary of the prior and posterior statistics for synthetic case No. 2 ($\rho = 0.25$) conditioned with concentration measurements at frequencies 0.3 days, 0.6 days and 0.9 days.

                                        Prior              Posterior (conditional)
Parameter                               (unconditional)    0.3 days × 7     0.6 days × 4     0.9 days × 3
--------------------------------------  -----------------  ---------------  ---------------  ---------------
Mean, $\bar{y}$ or $\hat{y}$            -4.619             -4.457           -4.483           -4.488
Std., $\sigma_y$ or $\sigma_{\hat{y}}$  0.864              0.303            0.315            0.323
Residual, mean                          0.208              4.562 × 10^{-2}  7.112 × 10^{-2}  7.605 × 10^{-2}
Residual, std.                          0.762              0.744            0.749            0.750
Normalized residual, mean               0.240              3.913 × 10^{-2}  7.128 × 10^{-2}  7.766 × 10^{-2}
Normalized residual, std.               0.883              0.939            0.936            0.929

This analysis of concentration measurement frequency has potentially important implications for the design of field experiments. If the model incorporated in the Kalman filter is assumed to describe the physical system perfectly (i.e., there is no system noise) and the measurements are error-free, the results indicate that conditioning once with concentration measurements should recover most of the information available regarding the NAPL distribution. This, however, is rarely the case in real-world applications, since measurements are inevitably associated with measurement error and mathematical models often involve approximations as well. Therefore, repeated concentration measurements at the same locations may remain necessary for model testing and refinement in order to produce reliable estimates of the aquifer hydrogeochemical parameters. Furthermore, repeated conditioning becomes a must if the goal of the Kalman filter is to predict the site-specific movement of the solute plume, because repeated conditioning with concentration measurements constantly forces the algorithm to produce a concentration distribution that is close to the actual history of plume migration.

Table 3.5: Summary of the prior and posterior statistics for synthetic case No. 3 ($\rho = -0.25$) conditioned with concentration measurements at frequencies 0.3 days, 0.6 days and 0.9 days.

                                        Prior              Posterior (conditional)
Parameter                               (unconditional)    0.3 days × 7     0.6 days × 4     0.9 days × 3
--------------------------------------  -----------------  ---------------  ---------------  ---------------
Mean, $\bar{y}$ or $\hat{y}$            -4.619             -4.48            -4.50            -4.50
Std., $\sigma_y$ or $\sigma_{\hat{y}}$  0.864              0.304            0.306            0.302
Residual, mean                          0.209              7.074 × 10^{-2}  8.799 × 10^{-2}  8.748 × 10^{-2}
Residual, std.                          0.792              0.747            0.752            0.754
Normalized residual, mean               0.242              7.342 × 10^{-2}  9.563 × 10^{-2}  9.500 × 10^{-2}
Normalized residual, std.               0.917              0.949            0.943            0.937

In order to examine the effects of the spatial configuration of the sampling network on NAPL estimation, the base case ($\rho = 0$) was rerun with the sampling network shown in Figure 3.4b. The conditional $\ln\theta_n$ and the conditional standard deviation $\hat{\sigma}_y$ at two horizontal layers Z2 and Z4 (0.327 m and 0.762 m above the base, respectively) are plotted as contour lines in Figure 3.19. The statistics of the unconditional and conditional $\ln\theta_n$ fields are summarized in the first column of Table 3.6. Both Figure 3.19 and Table 3.6 indicate that, with the same number of measurements used for conditioning at the same measuring times, the accuracy of the estimated NAPL field does not differ much from that in the base case. Both sampling network designs appear to have approximately captured the dominant features of the "true" NAPL distribution in some of the upstream area, but missed major features in downstream or near-boundary regions. Note that the conditional standard deviations of $\ln\theta_n$ (i.e., $\hat{\sigma}_y$) for the sampling network shown in Figure 3.4b are not dramatically different from the base case values.

Table 3.6: Summary of the prior and posterior statistics for synthetic case No. 1 ($\rho = 0$, sampling network shown in Figure 3.4b), synthetic case No. 4 ($\rho = -1.0$, perfectly negatively correlated $\ln K$ and $\ln\theta_n$ fields), and synthetic case No. 5 ($\rho = 0$, use 420 concentration measurements at time = 0.5 and 1.0 days).

Parameter                             Case No. 1         Case No. 4         Case No. 5
------------------------------------  -----------------  -----------------  -----------------
Sampling network                      Figure 3.4b        Figure 3.4a        Figure 3.4c
Conditioning times                    0.3 days × 7       0.3 days × 7       0.5 days

Prior statistics (unconditional)
  Mean $\bar{y}$                      -4.619             -4.619             -4.619
  $\sigma_y$                          0.864              0.864              0.864
  Residual, mean                      -7.70 × 10^{-2}    2.35 × 10^{-3}     -7.70 × 10^{-2}
  Residual, std.                      0.803              0.817              0.803
  Normalized residual, mean           -8.92 × 10^{-2}    2.73 × 10^{-3}     -8.92 × 10^{-2}
  Normalized residual, std.           0.930              0.946              0.930

Posterior statistics (conditional)
  Mean $\hat{y}$                      -4.508             -4.64              -4.38
  $\sigma_{\hat{y}}$                  0.188              0.204              0.304
  Residual, mean                      -0.188             1.95 × 10^{-2}     -0.243
  Residual, std.                      0.771              0.822              0.753
  Normalized residual, mean           -0.238             4.82 × 10^{-2}     -0.286
  Normalized residual, std.           0.959              1.18               0.947

Synthetic case No. 4 assumes that the random fields $\ln\theta_n$ and $\ln K$ are perfectly negatively correlated, and the conditioning is based on 60 measurements made at each of seven times. Estimates of (a) $\ln\theta_n$, (b) the conditional $\hat{\sigma}_y$, (c) $q_{xy}$, and (d) $q_{xz}$ at horizontal layer Z4 are shown in Figure 3.20. Statistics of the (normalized) estimation residual of $\ln\theta_n$ are listed in the second column of Table 3.6. It appears that the perfectly negative correlation assumption leads to slight underestimation of $\ln\theta_n$. Also note that the uncertainties associated with the NAPL estimates are slightly lower than in the base case, which reflects the stronger correlation $\rho_{cy}(\mathbf{x}, \mathbf{x}')$ between a concentration measurement made at $\mathbf{x}$ and the NAPL estimate at $\mathbf{x}'$. This result was also found in the unconditional moment analyses. The NAPL prediction uncertainty (measured by $\hat{\sigma}_y$) is substantially lower than in the base case, which is expected because perfectly negative correlation between $\ln\theta_n$ and $\ln K$ generates the strongest correlation between concentration measurement and NAPL (Chapter 2).

Case 5 examines the snap-shot sampling scenario (Figure 3.4c), in which 420 concentration measurements are taken at 0.5 days. Figure 3.21 shows the estimated $\ln\theta_n$ (contour lines) superimposed on the originally simulated NAPL random field (color image) at horizontal layers Z2, Z3, Z4, Z5, and Z6, which are 0.327, 0.544, 0.762, 0.980, and 1.197 m above the base of the domain, respectively. Similarly, the conditional standard deviation $\hat{\sigma}_y$ is shown in Figure 3.22. The statistical analysis of the (normalized) NAPL estimation error is summarized in the third column of Table 3.6. Figure 3.21 clearly shows that the estimated $\ln\theta_n$ field reproduces the "true" NAPL field in much more detail than in the base case. High-NAPL areas located near the boundaries tend to be recovered by the algorithm. This case is directly comparable to the base case in terms of sampling effort because both cases are based on a total of 420 concentration measurements. These results lead to the conclusion that, for the purpose of estimating aquifer parameters, measurements made in a snap-shot fashion at one or a few times seem to be more efficient than frequent concentration measurements in a coarse sampling network. The statistical analysis of the (normalized) estimation error indicates an overestimation of residual NAPL content, as also occurred in the base case. The spatial average of the normalized residual is $-0.286$, showing that the $\ln\theta_n$ estimates are slightly more biased than in the base case ($-0.192$). Cumulative distribution functions for both the conditional and unconditional normalized residuals of $\ln\theta_n$ are presented in Figure 3.23.
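
The equal sampling effort of the two designs follows from simple counting, sketched below. The function name `total_measurements` is hypothetical; the sampler counts, the five samples per multilevel sampler, and the numbers of measuring times are those given in this chapter.

```python
SAMPLES_PER_MLS = 5   # vertical sampling points per multilevel sampler


def total_measurements(n_mls, n_times, per_mls=SAMPLES_PER_MLS):
    """Total number of concentration measurements in a sampling design."""
    return n_mls * per_mls * n_times


base_case = total_measurements(n_mls=12, n_times=7)   # Figure 3.4a, seven times
snapshot = total_measurements(n_mls=84, n_times=1)    # Figure 3.4c, one time
print(base_case, snapshot)
```

Both designs use 420 measurements in total, so the difference in estimation quality reflects how the measurements are distributed in space and time rather than how many are taken.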

3.4  Summary 

Using a stochastic perturbation method and optimal estimation theory, a three-dimensional, distributed-parameter extended Kalman filter was developed to estimate spatially distributed NAPL residual content as well as Darcy flux. The objective of this algorithm is to generate informative maps of aquifer hydrogeochemical properties for the characterization of remediation sites and to quantify the accuracy of the estimated parameters. Another potential application of the filtering algorithm is to predict the site-specific movement of a solute plume in a heterogeneous subsurface environment. Such predictions and estimates are useful in monitoring contaminant plumes and may lead to more efficient design of site monitoring and remediation networks.

The Kalman filter is developed from a physically based mathematical model that describes the major processes of tracer transport by groundwater, i.e., advection, dispersion, and linear sorption, the last of which describes the movement of a partitioning tracer between the water phase and the NAPL phase. The moment propagation equations are first-order approximations that rely on the small-perturbation assumption. The conditioning component of the algorithm requires point measurements of a nonpartitioning tracer, which are used to update the Darcy flux field, and of a partitioning tracer, which are used to update only the NAPL field.

Demonstration of the algorithm with synthetic data sets confirms that the
algorithm successfully captures the dominant features of the NAPL and flux distributions
and accurately predicts the estimation error. After conditioning on concentration
measurements, the estimated NAPL and Darcy flux distributions are generally associated
with reduced uncertainty compared with those obtained without conditioning. Information
drawn from concentration measurements by the filtering algorithm generally leads
to the recovery of improved parameters primarily in the areas upstream of the sampling
locations. Parameters estimated downstream of the last row of sampling points
approach their unconditional values, and the conditional standard deviations (or
uncertainties) likewise approach the unconditional standard deviations. With the sampling
network schemes shown in Figures 3.4a and 3.4b, parameters near the boundaries are not
accurately estimated by the algorithm.

The conditional simulation results lead to the conclusion that measuring
concentration once should provide sufficient information for the filtering algorithm to
estimate NAPL and flux. As already suggested by the unconditional study, repeated
concentration measurements made at the same locations do not seem to provide
significant additional information. A more efficient sampling scheme seems to be the
snap-shot approach, which measures tracer concentration once at a large number of
spatial locations. Synthetic simulation indicates that parameter estimation based on a
snap-shot sampling scheme tends to reveal more detail in the estimated distribution of
aquifer parameters. However, these conclusions apply only to the situation in which the
physical system is represented perfectly by the mathematical model and the measurements
are error-free. Because both conditions are rarely satisfied in real-world applications,
frequent sampling may be required, particularly to evaluate model accuracy. Frequent
conditioning will always be required if the goal of the algorithm is to accurately
estimate site-specific tracer transport.

Although the algorithm tends to slightly overestimate or underestimate the
ln NAPL field, depending on the assumption made about the correlation between the NAPL
and hydraulic conductivity fields, the magnitude of the overestimation (or
underestimation) is not significant and does not appear to be sensitive to the degree of
correlation between the two random fields. The case assuming perfectly negative
correlation between NAPL and conductivity does show a slightly greater reduction of
uncertainty in the NAPL estimate than the uncorrelated cases; however, the reduction
is not as significant as that suggested by the unconditional analyses of the
concentration-NAPL correlation. This may be attributable to the addition of measurement
error, which at some spatial locations is approximately the same order of magnitude as
the concentration measurement itself.

The algorithm presented here incorporates only point measurements of
partitioning and nonpartitioning tracer concentrations. Point measurements of hydraulic
conductivity, head, and NAPL residual content can also be used to improve the
estimate if they are available. However, this will require significantly more
computational effort and data storage. The major limitation in applying this model
to three-dimensional problems is the computational effort required to solve the
concentration covariance and cross-covariance equations. The increasingly detailed
description of the random fields of concentration and aquifer parameters that is
obtained through conditioning requires an even finer discretization, which in turn
determines the size of the state covariance matrix. Nevertheless, the model has the
advantage of requiring less CPU time than a conditional Monte Carlo simulation
[61, 62]. More efficient solution techniques will need to be developed before the
model can handle more sophisticated, large-scale three-dimensional applications.

Figure 3.4: Horizontal plan view of the spatial placement of multilevel samplers
(MLSs). (a) 12 MLSs evenly distributed; (b) 12-MLS configuration; (c) 84-MLS
snap-shot configuration.

Figure 3.5: Natural log hydraulic conductivity ln K generated by a turning-band
algorithm, shown at horizontal layers Z2, Z3, Z4, and Z5 (0.327, 0.544, 0.762, and
0.980 m above the base, respectively).

Figure 3.6: Natural log hydraulic conductivity ln K generated by a turning-band
algorithm, shown at vertical slices Y2, Y4, and Y6 (0.762, 1.778, and 2.794 m from
the origin in the y-direction, respectively).

Figure 3.7: Vector plot of the x-y component (q_xy) of the simulated Darcy flux at
horizontal layers Z2, Z3, Z4, and Z5 (0.327, 0.544, 0.762, and 0.980 m above the base,
respectively).

Figure 3.8: Vector plot of the x-z component (q_xz) of the simulated Darcy flux at
vertical slices Y2, Y4, and Y6 (0.762, 1.778, and 2.794 m from the origin in the
y-direction, respectively).

Figure 3.9: Vector plot of the x-y component (q_xy) of the estimated Darcy flux at
horizontal layers Z2, Z3, Z4, and Z5 (0.327, 0.544, 0.762, and 0.980 m above the base,
respectively). Synthetic case No. 1: ρ = 0.

Figure 3.10: Vector plot of the x-z component (q_xz) of the estimated Darcy flux at
vertical slices Y2, Y4, and Y6 (0.762, 1.778, and 2.794 m from the origin in the
y-direction, respectively). Synthetic case No. 1: ρ = 0.




Figure 3.11: Conditional and unconditional mean concentrations are shown as solid
and dashed contour lines, respectively, superimposed on the synthetically simulated
concentration fields (in color). Shown above is horizontal layer Z4 (0.762 m above
the base) at times 0.5, 1.0, and 1.5 days. Synthetic case No. 1: ρ = 0.

Figure 3.12: Conditional concentration standard deviation is plotted as solid contour
lines at horizontal layer Z4 (0.762 m above the base) at times 0.5, 1.0, and 1.5 days.
Synthetic case No. 1: ρ = 0.

Figure 3.13: Estimation of ln θ_n is shown as solid contour lines superimposed on
the originally simulated NAPL random field (in color). Shown above are four
horizontal layers Z2, Z3, Z4, and Z5 (0.327, 0.544, 0.762, and 0.980 m above the base,
respectively). Synthetic case No. 1: ρ = 0.

Figure 3.14: Estimation of ln θ_n is shown as solid contour lines superimposed on the
originally simulated NAPL random field (in color). Shown above are three vertical
slices Y2, Y4, and Y6 (0.762, 1.778, and 2.794 m from the origin in the y-direction,
respectively). Synthetic case No. 1: ρ = 0.

Figure 3.15: Conditional standard deviation of ln θ_n is shown as solid contour lines
at four horizontal layers Z2, Z3, Z4, and Z5 (0.327, 0.544, 0.762, and 0.980 m above
the base, respectively). Synthetic case No. 1: ρ = 0.

Figure 3.16: Conditional standard deviation of ln θ_n is shown as solid contour lines
at three vertical slices Y2, Y4, and Y6 (0.762, 1.778, and 2.794 m from the origin in
the y-direction, respectively). Synthetic case No. 1: ρ = 0.

Figure 3.17: Cumulative distribution function of the prior and posterior normalized
residuals of ln θ_n in comparison to a unit Gaussian distribution. Synthetic case
No. 1: ρ = 0.


Figure 3.18: Conditional correlation ρ_cy(x, x') with respect to reference point x (2.159
m, 1.778 m, 0.762 m) at horizontal layer Z4 (0.762 m above the base) is plotted prior
to and after measurement times 0.5 days and 1.0 days.

Figure 3.19: Estimation of ln θ_n is shown (on the left) as solid contour lines
superimposed on the originally simulated NAPL random field (in color); on the right is
the contour plot of the corresponding conditional standard deviation of ln θ_n. Shown
above are two horizontal layers, Z2 and Z4 (0.327 m and 0.762 m above the base,
respectively). Synthetic case No. 1: ρ = 0 with the spatial sampling network shown in
Figure 3.4b.

Figure 3.20: At horizontal layer Z4 (0.762 m above the base): (a) Estimation of
ln θ_n (contour lines) superimposed on the originally simulated NAPL random field
(in color); (b) Conditional standard deviation of ln θ_n (contour lines); and (c) Vector
plot of the x-y component (q_xy) of the estimated Darcy flux. (d) At vertical slice Y4
(1.778 m from the origin in the y-direction), vector plot of the x-z component (q_xz) of
the estimated Darcy flux. Synthetic case No. 4: ρ = -1.0.

Figure 3.21: Estimation of ln θ_n is shown as solid contour lines superimposed on the
originally simulated NAPL random field (in color). Shown above are five horizontal
layers Z2, Z3, Z4, Z5, and Z6 (0.327, 0.544, 0.762, 0.980, and 1.197 m above the base,
respectively). Synthetic case No. 5: ρ = 0, conditioning on 420 measurements at time
0.5 days.

Figure 3.22: Conditional standard deviation of ln θ_n is shown as solid contour lines
at five horizontal layers Z2, Z3, Z4, Z5, and Z6 (0.327, 0.544, 0.762, 0.980, and 1.197
m above the base, respectively). Synthetic case No. 5: ρ = 0, conditioning on 420
measurements at time 0.5 days.

Figure 3.23: Cumulative distribution function of the prior and posterior normalized
residuals of ln θ_n in comparison to a unit Gaussian distribution. Synthetic case
No. 5: ρ = 0, conditioning on 420 measurements at time 0.5 days.


CHAPTER  4 
ANALYSIS  OF  FIELD  DATA 

4.1  Introduction 

Groundwater contamination from improper disposal of hazardous industrial
wastes has drawn increasing public concern, which demands scientific investigation of
mathematical models capable of representing field conditions. For the output
of a mathematical model to be site-specific, the model should make the
best use of all available sources of information. Bayesian estimation theory provides
a convenient means of integrating model predictions, which are based on the
mathematical description of physical and chemical processes, with field data. The theory
involves continual updating, or conditioning, of model predictions on field measurements
whenever they become available. In the previous two chapters (Chapters 2 and 3),
a stochastic model based on Bayesian estimation theory was developed to
(1) estimate aquifer hydrogeochemical parameters, and (2) predict the movement of a
solute plume based on solute concentration measurements. The approach encompasses
(1) simultaneously solving a system of coupled stochastic partial differential equations
(derived from the traditional advection-dispersion-retardation equation using
stochastic perturbation techniques) to simulate the temporal evolution of the first and
second ensemble moments of the variables of interest between measurement times, and (2)
conditioning these moments on spatially scattered measurements of concentration
at discrete times. The model was applied to several synthetic cases in which
synthetically generated reactive and nonreactive solute concentrations were used to
optimally estimate the distributions of hydrogeochemical parameters, such as Darcy
flux and NAPL residual content, in a three-dimensional saturated aquifer contaminated
by nonaqueous phase liquids (NAPL).

In this chapter, the stochastic model is applied to a field tracer experiment
conducted by a group of researchers at the University of Florida (UF) as part of a
study of NAPL-site remediation using cosolvent flushing techniques [5]. Annable
et al. [3] conducted interwell partitioning tracer (IWPT) tests in a hydraulically
isolated test cell (3.5 m × 4.3 m) with light NAPL trapped within a 1.5-meter smear
zone in a shallow, unconfined sand and gravel aquifer. The purpose of the IWPT tests
was to quantify the NAPL present based on the measured retardation of a partitioning
tracer (2,2-dimethyl-3-pentanol) relative to a non-reactive tracer (bromide).
Concentration measurements of the tracers were collected from 12 multilevel samplers
(MLSs) equally spaced within the test cell. The objective of the current study is to
characterize the spatial distributions of residual NAPL content and Darcy flux
within the test cell using the optimal estimation inverse method.

Quantifying the spatial distributions of aquifer hydrogeochemical parameters
at hazardous waste sites is of great importance for determining environmental
impacts and for designing a remediation strategy. NAPL contaminants include a wide
range of industrial compounds such as gasoline, fuel oils, chlorinated and fluorinated
hydrocarbons, creosote, and oils [88]. Both light and dense NAPLs, typically residing
in the vadose zone and saturated zone, respectively, contribute to extensive
groundwater contamination and pose a long-term threat to groundwater quality because
they exhibit: (1) low liquid viscosities (are able to move easily into the subsurface);
(2) low interfacial tensions with water (are able to enter into water-wet fractures
relatively easily); (3) high volatilities (are able as gases to diffuse rapidly
downwards into the unsaturated zone); (4) low absolute solubilities (are difficult to
remove from the saturated zone); (5) high solubilities relative to drinking water limits
(are able to cause significant health risks even when small amounts dissolve); (6) low
partitioning to soils (are not retarded by aquifer materials); and (7) low
degradabilities [94]. Characterization of spatially distributed NAPL in its residual
saturation phase is a complex problem, yet often a prerequisite to remediation.

The use of interwell partitioning tracer (IWPT) tests, in which tracers that
partition into the NAPL phase are displaced through the aquifer, is an attractive
alternative to the traditional intrusive techniques, which may require extensive soil
coring and analysis [5]. The theory behind the use of partitioning tracers to quantify
NAPL saturation is based on a simplified linear relationship that characterizes the
mass distribution of tracer between the aqueous and NAPL phases. Phase partitioning
delays, or retards, the transport of the partitioning tracer. Therefore, by studying
the separation between the breakthrough curves of a pair of tracers with different
partitioning coefficients, the unknown NAPL volume (or residual saturation) in the
region swept by the tracers can be determined. Detailed discussions of the theory and
of laboratory experiments are given by Jin et al. [72] and Whitley et al. [132].
Annable et al. [3] applied Jin's method to the UF tracer experiments and calculated
the total volume of residual NAPL in the test cell. Using tracer breakthrough data
from three extraction wells (EWs), Annable et al. [3] found that the residual NAPL
saturation (Sn) averaged over the entire cell was approximately 0.046. Their results
also show that the NAPL residual saturation (Sn) varies from 0.031 for the volume of
aquifer swept by EW1 to 0.088 for the volume of aquifer swept by EW3. This indicates
spatial heterogeneity in both the NAPL distribution and the aquifer hydraulic
properties.
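The temporal-moment calculation behind such volume-averaged estimates can be sketched as follows. This is a minimal illustration, not the procedure of [3] or [72] in detail: it assumes the commonly used linear-partitioning relation R = 1 + K_N Sn / (1 - Sn), with the retardation factor R taken as the ratio of the mean arrival times of the partitioning and nonpartitioning tracers. The function names and the triangular breakthrough curves are hypothetical.

```python
def trapezoid(y, t):
    """Trapezoidal integral of samples y taken at times t."""
    return sum(0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i])
               for i in range(len(t) - 1))

def mean_arrival_time(t, c, pulse_duration=0.0):
    """First normalized temporal moment of a breakthrough curve,
    corrected for half the duration of the injected pulse."""
    m0 = trapezoid(c, t)
    m1 = trapezoid([ti * ci for ti, ci in zip(t, c)], t)
    return m1 / m0 - 0.5 * pulse_duration

def napl_saturation(t_partitioning, t_nonpartitioning, k_n):
    """Invert R = 1 + K_N * Sn / (1 - Sn) for Sn (assumed linear partitioning)."""
    r = t_partitioning / t_nonpartitioning
    return (r - 1.0) / (r - 1.0 + k_n)

# Hypothetical triangular breakthrough curves (times in days).
t_np, c_np = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]   # nonpartitioning tracer
t_p, c_p = [0.0, 1.6, 3.2], [0.0, 1.0, 0.0]     # partitioning tracer

s_n = napl_saturation(mean_arrival_time(t_p, c_p),
                      mean_arrival_time(t_np, c_np),
                      k_n=12.9)
```

Because both arrival times are integrals over the whole breakthrough curve, the result is a single average over the swept volume, which is why the method cannot by itself resolve the spatial distribution of NAPL.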

Direct use of Jin's method [72] will only yield a volume-averaged value for
residual NAPL content over the region swept by tracers. However, further knowledge
of the distribution of NAPL is often critical to the design of a good remediation
strategy in field applications. The stochastic model developed in this work has the
advantage that it produces site-specific, optimal estimates of both NAPL and Darcy
flux in a spatially distributed manner.

In the stochastic approach, the unknown heterogeneities of residual NAPL
content and Darcy flux are treated as random fields that can be described with a
few statistical parameters. The variability in Darcy flux is understood to be a result
of the variability in hydraulic conductivity and is thus the sole source of
concentration variability for the nonpartitioning tracer. The random conductivity and
the NAPL variability together are the sources of concentration variability for the
partitioning tracer. Thus the nonpartitioning tracer measurements may be used to
estimate the distribution of Darcy flux, and the partitioning tracer measurements may
be used to estimate the distributions of both NAPL and Darcy flux.

The stochastic model is a distributed-parameter extended Kalman filter that
consists of a moment propagation algorithm and a moment update algorithm. The
moment propagation algorithm consists of a set of coupled partial differential
equations that describe how the first and second moments of the state variables
propagate through time via the processes of advection, dispersion, and retardation, as
well as macrodispersion caused by hydrogeochemical heterogeneities. The first moments
of the state variables represent the expected values (or means) of the random variables
over an ensemble. The expected variation around this mean is represented by the
second moments. Unconditional moments (which do not depend on field measurements)
can be made site-specific (or conditional) through the moment update algorithm.
The conditioning part of the algorithm performs a function similar to that of
cokriging with a known mean, which is commonly used in geostatistical applications.
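The moment update step can be illustrated with the standard Kalman (simple cokriging) formulas for a single point measurement. The sketch below is a generic textbook update, not the implementation used in this work; the function name, the two-state example, and its numbers are hypothetical. State 0 plays the role of a concentration at a sampling point and state 1 the role of ln θ_n in a nearby block.

```python
def condition_on_point(mean, cov, j, y, r=0.0):
    """Condition a mean vector and covariance matrix on a measurement y
    of state j with measurement-error variance r (scalar Kalman update)."""
    n = len(mean)
    innovation_var = cov[j][j] + r
    gain = [cov[i][j] / innovation_var for i in range(n)]
    innovation = y - mean[j]
    new_mean = [mean[i] + gain[i] * innovation for i in range(n)]
    new_cov = [[cov[i][k] - gain[i] * cov[j][k] for k in range(n)]
               for i in range(n)]
    return new_mean, new_cov

# Hypothetical prior (unconditional) moments: zero-mean perturbations,
# unit variances, cross-covariance 0.5 between the two states.
prior_mean = [0.0, 0.0]
prior_cov = [[1.0, 0.5],
             [0.5, 1.0]]

# An error-free measurement of state 0 equal to 1.0.
post_mean, post_cov = condition_on_point(prior_mean, prior_cov, j=0, y=1.0)
```

In this example the unmeasured state is shifted to 0.5 and its variance drops from 1.0 to 0.75, entirely through the cross-covariance. This is the mechanism by which concentration data inform the flux and NAPL fields; where the cross-covariance is near zero, the estimate stays at its unconditional value.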

To initialize the algorithm, all of the parameter fields are assumed to be
stationary. Thus the resulting unconditional plumes and streamlines are regularly
shaped. As the conditioning algorithm proceeds, the simulated conditional mean
plumes, streamlines, and NAPL field are reshaped into a more realistic, irregular
configuration. These conditional means are the best linear unbiased estimates of
the actual random fields, based on all of the information available. In the application
discussed here, concentration measurements of the nonpartitioning tracer bromide
are used solely to condition the Darcy flux field, while the measured concentration
data for the partitioning tracer 2,2-dimethyl-3-pentanol (DMP) are used to condition
the NAPL field (see Chapter 3 for details).

In  the  following  section,  we  will  demonstrate  how  this  stochastic  algorithm 
can  be  applied  to  the  UF  field  tracer  tests  in  order  to  obtain  the  best  estimates  of 
the  three-dimensional  Darcy  flux  and  NAPL  residual  saturation  distributions  in  the 
test  cell.  Qualitative  comparison  of  the  estimated  NAPL  residual  saturation  to  that 
obtained  via  soil  coring  and  Jin's  [72]  temporal  moment  methods  will  be  made.  The 
results  of  the  algorithm  should  be  useful  for  simulation  and  optimization  of  in-situ 
remediation  experiments. 

4.2    OU-1  Site  Background  and  Installation

The site selected for construction of the test cell and cosolvent flushing
evaluation is located within Operable Unit 1 (OU-1) at Hill Air Force Base,
approximately 30 miles north of Salt Lake City, Utah. The base is located west of the
Wasatch Mountain Range on a delta terrace formed by the Weber River. In the area where
the test cell was constructed, the geology consists of a shallow sand and gravel
surficial aquifer underlain by a thick clay aquitard. The sand and gravel is
approximately 6.1 m thick and the water table was located at 5.8 m below ground surface
(BGS) at the time of cell installation. The clay aquitard is encountered at
approximately 6.1 m BGS and extends to depths of 120 m or more. The clay unit is the
Alpine Formation and the overlying sand and gravel comprises the Provo Formation. In
the area of the test cell, the clay unit is suspected to have erosional features at the
sand and gravel contact, and has numerous thin sand stringers present [4].

There are a number of sources across the OU-1 site that may have
contributed to the light nonaqueous phase liquid (LNAPL) plume in which the test
cell was located. The primary source of NAPL is two Chemical Disposal Pits
(CDPs) located hydraulically up-gradient of the test cell location. These CDPs were
used during the 1940s and 1960s to dispose of aviation fuels (JP4) and chlorinated
solvents. Another potential contribution to the NAPL plume is from a nearby
former Fire Training Area (FTA), which may have contributed unextinguished fuels and
combustion byproducts to the site. The resulting complex NAPL is lighter than
water, exists as a plume covering several hectares, and is evidenced by up to 0.15
meters of free product measured in wells. Major components found in the NAPL complex
include 1,2 DCB, o-Xylene, and n-Decane. Based on soil boring PID readings and
soil core extraction data, the NAPL in the area of interest is located in a smear zone
that extends from about 1.0 meter above to less than 0.5 meters below the existing
water table. The extent of the NAPL smear zone constitutes the target remediation
zone, and the interwell partitioning tracer tests were conducted to characterize the
smear zone hydrogeochemical properties [3, 4].

The  Waterloo  Sheet  Pile  design  [118,  119]  was  used  to  construct  the  test  cell. 
The  sheet  pile  walls  extend  to  a  depth  of  9.1  meters,  or  about  3  meters  into  the  clay 
layer.  The  joints  connecting  the  sheet  piles  were  flushed  with  water  and  filled  with  a 
custom-designed  bentonite/cement/polymer  mixture  to  ensure  hydraulic  isolation. 

A plan view of the test cell is depicted in Figure 4.1. The test cell was
instrumented with four injection wells (IWs), three extraction wells (EWs), and 12
multilevel samplers (MLSs). The IWs and EWs were located 0.23 meters from the cell
walls and were evenly spaced along the ends of the cell, which places each well at the
center of the fraction of the total cell flow that it supplies. Each multilevel sampler
can provide five samples from five different vertical locations uniformly spaced from
4.57 meters BGS to 6.10 meters BGS at an interval of 0.38 meters, color-coded Black,
Blue, Red, White, and Yellow, respectively (Figure 4.2). Some of the MLSs had extra
deep sampling points penetrating 0.4 meters into the clay confining unit to investigate
the advancement of tracers (or cosolvents) into the aquitard. The MLSs were located
in three rows (in line with the extraction wells) of four samplers that were equally
spaced between the injection and extraction wells (Figure 4.1) [4]. The multilevel
samplers were installed mainly by Cone Penetrometer (CPT) and hollow-stem auger
methods, with special care taken to minimize disturbance to the hydraulic structure of
the test cell. Three monitoring wells (not shown in Figure 4.1) were also installed
along a central longitudinal transect for core sample collection and water level
monitoring purposes.

Figure 4.1: Plan view of OU1 test cell. Area within the dashed lines represents the
simulation domain.


Following the instrumentation, the cell was hydraulically tested by raising the
water level in the cell from approximately 5.2 meters BGS to 4.6 meters BGS. Based
on the measured water level rise induced by injection of 1,740 L of water at a flow rate,
Q, of 5.7 L/min, the water content θ_w was estimated to be 0.20. The dynamic response
of the cell was analyzed and a cell-wide averaged, saturated hydraulic conductivity
of 15 m/day was estimated.
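The water content estimate can be checked with a simple volume balance. The sketch below is a back-of-the-envelope calculation, not the analysis actually performed on the cell: it assumes that the injected water fills pore space uniformly over the nominal 3.5 m × 4.3 m cell footprint quoted in Section 4.1 and ignores the volume occupied by wells and samplers.

```python
# Volume balance for the rising-water-table test (nominal values).
injected_volume_m3 = 1740.0 / 1000.0      # 1,740 L of injected water
cell_area_m2 = 3.5 * 4.3                  # nominal plan area of the test cell
water_table_rise_m = 5.2 - 4.6            # from 5.2 m BGS to 4.6 m BGS

# Fillable pore volume per unit bulk volume in the zone that was flooded.
theta_w = injected_volume_m3 / (cell_area_m2 * water_table_rise_m)

# Time needed to inject that volume at 5.7 L/min, in hours.
injection_time_hr = 1740.0 / 5.7 / 60.0
```

Under these assumptions `theta_w` comes out near 0.19, which is consistent with the reported estimate of 0.20 given the approximate water levels.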

4.3    OU-1  Tracer  Tests 

Partitioning tracer selection for the OU-1 tracer test was mainly based on the
value of the partitioning coefficient between the aqueous and NAPL phases, K_N, which
was measured in batch and column tests conducted using NAPL from the OU-1 test
site. Desirable tracer characteristics include: (1) non-hazardous; (2) non-toxic;
(3) non-degrading; (4) reasonable cost and availability; and (5) easily quantifiable,
especially in the presence of multiple NAPL constituents [3]. Tracers selected for the
OU-1 pre-cosolvent flushing experiment were bromide (nonpartitioning) and ethanol,
n-pentanol, n-hexanol, and 2,2-dimethyl-3-pentanol (DMP), in order of increasing K_N
value (see Table 4.1). The larger the K_N value, the greater the separation between
the partitioning tracer and the nonpartitioning tracer.
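The effect of K_N on tracer separation can be made concrete with a short calculation. The sketch below assumes the linear-partitioning retardation factor R = 1 + K_N Sn / (1 - Sn) and uses the K_N values of Table 4.1 together with the cell-averaged saturation Sn = 0.046 reported by Annable et al. [3]; the function and variable names are hypothetical.

```python
def retardation_factor(k_n, s_n):
    """Retardation of a tracer with partitioning coefficient k_n in a medium
    with residual NAPL saturation s_n (linear partitioning assumed)."""
    return 1.0 + k_n * s_n / (1.0 - s_n)

# Partitioning coefficients from Table 4.1.
K_N = {
    "bromide": 0.0,
    "ethanol": 0.1,
    "n-pentanol": 1.4,
    "n-hexanol": 4.6,
    "DMP": 12.9,
}

S_N = 0.046  # cell-averaged residual NAPL saturation reported in [3]

R = {name: retardation_factor(k, S_N) for name, k in K_N.items()}
for name, r in R.items():
    print(f"{name:12s} R = {r:.3f}")
```

With these inputs bromide is unretarded, ethanol and n-pentanol lag it by less than 10 percent, and DMP travels at roughly 60 percent of the bromide velocity, which makes the bromide and DMP pair the easiest to separate.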

The interwell partitioning tracer (IWPT) test was conducted at the OU-1 test
cell over an eight-day period in October 1994. The array of tracers was introduced
over a 3.3-hour period as a pulse of approximately 0.1 pore volume after steady-state
flow had been established. Tracer concentrations at the extraction wells (EWs) and
multilevel samplers (MLSs) were measured at frequent intervals. Bromide and
2,2-dimethyl-3-pentanol concentration data were used in this study because they
exhibited the best mass recovery (102% and 92%, respectively) and the greatest degree of
separation.

Table 4.1: Tracers used in the Interwell Partitioning Tracer Test (IWPT) at the OU-1
test cell, including partitioning coefficients (K_N) and injected tracer concentrations
(c_0).

Tracer                       K_N      c_0 (mg/L)
Bromide                      0.0      273
Ethanol                      0.1      1330
n-Pentanol                   1.4      989
n-Hexanol                    4.6      945
2,2-Dimethyl-3-Pentanol      12.9     878

Steady flow with the water table level at 4.57 ± 0.003 m BGS was maintained
by continuously measuring and adjusting the pumping rate. The average flow rate
during the tracer experiment was 3.2 L/min, equivalent to an average Darcy velocity
of about 0.003 m/hr, or an average hydraulic residence time in the test cell of about
1 day [3]. Based on the measured hydraulic gradient of 0.05 within the cell during the
tracer test, an average saturated hydraulic conductivity of 17 m/day was calculated,
which compares well with the estimate of 15 m/day obtained during the rising water
table test.
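The conductivity and residence-time figures can be reproduced approximately from Darcy's law. The sketch below is a rough consistency check under stated assumptions, not the calculation of [3]: flow is taken to be uniform through a cross-section 3.5 m wide and as thick as the saturated interval between 4.57 m and 6.10 m BGS, the flow path is the nominal 4.3 m cell length, and the water content is 0.20.

```python
# Nominal inputs taken from the text; the geometry is an assumption.
flow_rate_m3_day = 3.2 * 60.0 * 24.0 / 1000.0   # 3.2 L/min in m^3/day
width_m = 3.5
saturated_thickness_m = 6.10 - 4.57
length_m = 4.3
theta_w = 0.20
hydraulic_gradient = 0.05
pulse_duration_day = 3.3 / 24.0

# Darcy's law: q = K * i, so K = Q / (A * i).
area_m2 = width_m * saturated_thickness_m
darcy_flux_m_day = flow_rate_m3_day / area_m2
conductivity_m_day = darcy_flux_m_day / hydraulic_gradient

# Residence time = water-filled pore volume / flow rate.
pore_volume_m3 = length_m * area_m2 * theta_w
residence_time_day = pore_volume_m3 / flow_rate_m3_day

# Size of the injected tracer pulse in pore volumes.
pulse_pore_volumes = pulse_duration_day / residence_time_day
```

These inputs give a conductivity of about 17 m/day, a residence time of about one day, and a pulse of roughly 0.14 pore volume, in line with the 17 m/day, 1 day, and approximately 0.1 pore volume stated in this section.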

4.4    Problem  Formulation 
4.4.1    Description  of  the  state  and  measurement 

The focus of this study is on the NAPL smear zone within the test cell. The
test cell was assumed to be a rectangular saturated box; water table fluctuations
due to pumping fluctuations were neglected, and steady-state flow was assumed. To
further simplify the problem, the effects of the capillary fringe and the
inclination of the clay bed were not considered. Concentration measurements are all
assumed to be made within the saturated domain, even though some of the Black
ports may be located in the capillary fringe and some of the Yellow ports may be
located in the clay bed in the field tracer experiment. Measurements at some Black
and Yellow ports were occasionally unavailable because of practical difficulties.

Figure 4.1 shows the plan view of the test cell and the relative locations of the
wells and MLSs. The area within the dashed lines indicates the domain for the stochastic
model. For the finite-difference solution, the three-dimensional simulation
domain $D$, with its boundary denoted by $\partial D$, is uniformly divided into 14 x 9 x 7
blocks (each 0.254 m x 0.389 m x 0.218 m), over each of which
the variables of interest are defined as volumetrically averaged quantities. Because of
the computational burden of solving for the non-stationary velocity covariances
that arise near wells, both injection and extraction wells are excluded from the
simulation. Instead, a constant-flux boundary condition is assumed at the inlet (left)
and outlet (right) sides of the domain, with the flow rate equal to the total
flow rate applied to the four injection wells in the field, i.e., 4.5524 m3/day, which
is equivalent to a mean Darcy flux in the longitudinal direction of $\bar{q}_x$ = 0.8535 m/day.
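The conversion from total flow rate to mean Darcy flux can be checked with a few lines of Python. This is a minimal sketch that uses only the values quoted in the text; the function name `mean_darcy_flux` is a hypothetical helper, not part of the original model code.

```python
# Mean longitudinal Darcy flux = total flow rate / cross-sectional area
# perpendicular to the mean flow direction (values quoted in the text).
Q = 4.5524              # total flow rate, m^3/day
L_Y, L_Z = 3.5, 1.524   # transverse and vertical extents of the domain, m


def mean_darcy_flux(flow_rate, width, height):
    """Return the flow rate divided by the area normal to the mean flow."""
    return flow_rate / (width * height)


q_x = mean_darcy_flux(Q, L_Y, L_Z)
print(f"q_x = {q_x:.4f} m/day")  # prints q_x = 0.8535 m/day
```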

Figure 4.3 shows a plan view of the simulation domain, whose dimensions are
3.556 m longitudinally, 3.5 m transversely, and 1.524 m vertically. A pulse of tracers,
i.e., bromide ($K_N$ = 0) and 2,2-dimethyl-3-pentanol (DMP, $K_N$ = 12.9), was
introduced at the inflow boundary for 3.3 hours (or 0.1375 days). The
volumetric water content, $\theta_w$ = 0.21, is assumed to be spatially and temporally
constant. Tracers transported by water undergo advection, dispersion, and sorption (in
the case of partitioning tracers). Mathematical descriptions of these processes were
presented in Chapters 2 and 3.
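To give a feel for how strongly DMP is delayed relative to bromide, the sketch below evaluates the standard partitioning-tracer retardation relation $R = 1 + K_N S_n/(1 - S_n)$. This relation is assumed here as the textbook form; the chapter itself defers the governing equations to Chapters 2 and 3. The saturation 0.046 is the cell-averaged value quoted later in this chapter, and `retardation_factor` is a hypothetical helper name.

```python
def retardation_factor(k_n, s_n):
    """Retardation of a tracer with NAPL-water partitioning coefficient k_n
    in a medium with NAPL saturation s_n (standard relation, assumed here)."""
    return 1.0 + k_n * s_n / (1.0 - s_n)


# Bromide does not partition (K_N = 0); DMP has K_N = 12.9.
for name, k_n in (("bromide", 0.0), ("DMP", 12.9)):
    print(name, round(retardation_factor(k_n, 0.046), 3))
```

Under these assumptions DMP travels roughly 1.6 times slower than bromide on average, which is the separation the inverse algorithm exploits.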

Using first-order stochastic perturbation methods, a Kalman filtering
algorithm was developed to describe the spatial-temporal behavior of the first and second
moments of the state variables, which include solute concentration $c(\mathbf{x}, t)$, the components
of Darcy flux $q_i(\mathbf{x})$, $i = 1, 2, 3$, and the logarithm of volumetric NAPL content $\ln\theta_n(\mathbf{x})$
(see Chapters 2 and 3 for details). A composite $5n \times 1$ state vector $\mathbf{Z}$ is thus defined
as:

$$\mathbf{Z} = \left[\mathbf{c}^T(\mathbf{x},t),\ \mathbf{q}_x^T(\mathbf{x}),\ \mathbf{q}_y^T(\mathbf{x}),\ \mathbf{q}_z^T(\mathbf{x}),\ \ln\boldsymbol{\theta}_n^T(\mathbf{x})\right]^T \qquad (4.1)$$

where superscript "T" represents the operation of matrix transpose, and $n$ is the
number of blocks into which the simulation domain is discretized.
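The size of the composite state vector, and of its covariance matrix, follows directly from the 14 x 9 x 7 discretization. The sketch below works this out; the dense double-precision storage figure is an assumption made for illustration, since the text does not state how the covariance matrix was stored.

```python
# Size of the composite state vector Z and its covariance matrix.
NX, NY, NZ = 14, 9, 7
FIELDS = 5  # c, q_x, q_y, q_z, ln(theta_n)

n_blocks = NX * NY * NZ
state_dim = FIELDS * n_blocks
dense_entries = state_dim ** 2
symmetric_entries = state_dim * (state_dim + 1) // 2

# Assumption: 8-byte floating-point numbers, dense storage.
dense_megabytes = dense_entries * 8 / 1e6

print(n_blocks, state_dim, dense_entries, symmetric_entries)
print(f"{dense_megabytes:.1f} MB for a dense double-precision covariance")
```

Storing only the upper triangle of the symmetric matrix would roughly halve the requirement.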
Figure 4.3: Plan view of the simulation domain.

The following assumptions are required in deriving the stochastic model:

•  Hydraulic conductivity $K$ is randomly distributed in space and time-invariant.
Its logarithm, $\ln K$, is represented by an unconditionally stationary, statistically
isotropic random field whose covariance structure is characterized by a negative
exponential function.

•  The unconditional logarithm of volumetric NAPL content, $\ln\theta_n$, is a
time-invariant, stationary, statistically isotropic random field, characterized by a
negative exponential covariance function.

•  Random fields $\ln K$ and $\ln\theta_n$ are assumed to be correlated according to the
relationship $\ln\theta_n = \bar{y} + \delta y = \bar{y} + \zeta\rho f + \sqrt{1-\rho^2}\,\eta$, where $\bar{y}$ and $\delta y$ represent the
mean and a zero-mean perturbation of $\ln\theta_n$, respectively; $\zeta = \sigma_{\ln\theta_n}/\sigma_{\ln K}$, in which
$\sigma_{\ln\theta_n}$ and $\sigma_{\ln K}$ are the standard deviations of the respective random fields; $f$
represents the zero-mean perturbation field of $\ln K$; $\eta$ represents a zero-mean,
spatially correlated random field; and $\rho$ ($-1 \le \rho \le 1$) is a weighting factor that
specifies how random fields $\ln K$ and $\ln\theta_n$ are correlated. Note that when $\rho = 0$,
$\ln K$ and $\ln\theta_n$ are uncorrelated, and that when $\rho > 0$ (or $\rho < 0$), $\ln K$ and $\ln\theta_n$
are positively (or negatively) correlated.

•  The small perturbation assumption applies to all random fields.

•  The spatial variation of Darcy flux is caused solely by the variation of hydraulic
conductivity. Prior to conditioning, $q_i$, $i = 1, 2, 3$, are each assumed to be
stationary random fields, characterized by expressions presented by Rubin and Dagan
[107].

•  Dispersion coefficients $D_{ij}$, partitioning coefficients $K_N$, volumetric water
content $\theta_w$, tracer injection concentration $c_0$, time of injection $t_b$, and total flow
rate $Q$ are all assumed to be deterministic quantities.

•  Initial and boundary conditions are known perfectly.
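The role of the weighting factor $\rho$ in the third assumption can be illustrated numerically. The sketch below draws pointwise samples of $f$ and $\eta$ and forms $\delta y = \zeta\rho f + \sqrt{1-\rho^2}\,\eta$. It is a simplification made for illustration only: spatial correlation is ignored (samples are independent from point to point), $\eta$ is given the same standard deviation as $\ln\theta_n$, and the function names are hypothetical.

```python
import math
import random


def sample_perturbations(rho, sigma_f, sigma_y, n_samples, seed=0):
    """Draw pointwise samples of f and delta_y = zeta*rho*f + sqrt(1-rho^2)*eta."""
    rng = random.Random(seed)
    zeta = sigma_y / sigma_f            # ratio of standard deviations
    f_vals, dy_vals = [], []
    for _ in range(n_samples):
        f = rng.gauss(0.0, sigma_f)     # ln K perturbation
        eta = rng.gauss(0.0, sigma_y)   # independent field (std assumed = sigma_y)
        f_vals.append(f)
        dy_vals.append(zeta * rho * f + math.sqrt(1.0 - rho * rho) * eta)
    return f_vals, dy_vals


def correlation(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


for rho in (0.0, -0.25, -1.0):
    f_vals, dy_vals = sample_perturbations(rho, 1.0, 0.8637, 20000)
    print(rho, round(correlation(f_vals, dy_vals), 3))
```

With these choices the sample correlation between $f$ and $\delta y$ approaches $\rho$, and the variance of $\delta y$ stays at $\sigma_y^2$ regardless of $\rho$.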

Concentration  measurements  c*(x*,t,)  were  taken  at  discrete  times,  and  were 
modeled  by  the  measurement  equation  presented  in  Chapter  4.  Measurement  error 
standard  deviation  was  assumed  to  be  constant  in  time  and  equal  to  2%  of  the  initial 
normalized  concentration.  Concentration  measurements  of  bromide  and  DMP  were 
temporally  interpolated  from  actual  measurement  times  to  measurement  times  of  0.48 
days  and  1.03  days  in  order  to  create  a  spatial  "snap-shot"  of  49  measurements  at 
each  time. 
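How a single concentration measurement informs a correlated, unmeasured parameter can be sketched with the textbook linear Kalman update for one direct observation. This is not the dissertation's measurement equation, only a two-variable illustration: the prior means and covariances below are hypothetical numbers, and the measurement error standard deviation of 0.02 assumes the 2% figure applies to a normalized concentration of one.

```python
def kalman_update_scalar(mean, cov, j, observed, meas_var):
    """Condition a Gaussian state on one noisy observation of component j."""
    n = len(mean)
    innovation_var = cov[j][j] + meas_var
    gain = [cov[i][j] / innovation_var for i in range(n)]
    residual = observed - mean[j]
    new_mean = [mean[i] + gain[i] * residual for i in range(n)]
    new_cov = [[cov[i][k] - gain[i] * cov[j][k] for k in range(n)]
               for i in range(n)]
    return new_mean, new_cov


# State: [normalized concentration, ln(theta_n)]; illustrative prior moments.
prior_mean = [0.5, -4.6193]
prior_cov = [[0.04, -0.05],
             [-0.05, 0.8637 ** 2]]

post_mean, post_cov = kalman_update_scalar(
    prior_mean, prior_cov, j=0, observed=0.4, meas_var=0.02 ** 2)

print([round(v, 4) for v in post_mean])
print(round(post_cov[1][1] ** 0.5, 4))  # conditional std of ln(theta_n)
```

Because the prior cross-covariance is nonzero, observing the concentration shifts the $\ln\theta_n$ estimate and reduces its variance even though $\ln\theta_n$ itself is never measured.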
4.4.2  Initialization

The parameters used in the stochastic inverse study are summarized in Table
4.2. Table 4.3 lists the spatial locations of the 12 multilevel samplers in the OU-1 test cell and
in the simulation domain, while Table 4.4 lists the elevations of the color-coded
vertical measurement points along each multilevel sampler.
Small discrepancies exist in the vertical locations of the sampling points between
the experimental and simulation domains because of discretization limitations. This is
not expected to significantly affect the estimation of the NAPL distribution, since the
differences are relatively small compared to the assumed correlation length of the
$\ln\theta_n$ random field.
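The simulation-domain coordinates listed in Tables 4.3 and 4.4 are consistent with moving each sampling point to the center of the grid block that contains it. That interpretation is an inference from the tabulated values, not something the text states, and `block_center` is a hypothetical helper. A minimal sketch:

```python
import math

# Domain extents (m) and block counts quoted in the text.
LENGTH = {"x": 3.556, "y": 3.5, "z": 1.524}
BLOCKS = {"x": 14, "y": 9, "z": 7}


def block_center(coord, axis):
    """Return the center of the grid block containing coord along one axis."""
    size = LENGTH[axis] / BLOCKS[axis]
    index = int(math.floor(coord / size))
    index = min(max(index, 0), BLOCKS[axis] - 1)  # clamp to the grid
    return (index + 0.5) * size


# Experimental coordinates of MLS02 and the elevation of the Blue port.
print(round(block_center(1.395, "x"), 3))
print(round(block_center(0.58, "y"), 3))
print(round(block_center(1.14, "z"), 3))
```

The largest vertical shift under this mapping is about 0.11 m (Yellow and Black ports), which is small relative to the 0.3048 m correlation length.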

Preliminary statistical inference on the $S_n$ data estimated by Annable et al. [5]
from MLS partitioning tracer data suggests that the $\ln\theta_n$ field has a vertical correlation
length $\lambda_y$ of approximately 1 ft (0.3048 m), with a mean $S_n$ of 0.046 and a $\ln\theta_n$ standard
deviation of 0.8637 (Table 4.2). Statistics for the unconditional hydraulic conductivity field,
however, are not available. For the purpose of this study, the unconditional mean of the
$\ln K$ field was estimated from the cell hydraulic test described previously.
Furthermore, the $\ln K$ field was assumed to be characterized by a negative-exponential
covariance function with the same correlation length as that of the $\ln\theta_n$ field (1 ft). The
unconditional standard deviation of $\ln K$ was assumed to be 1, which is comparable to
that of the $\ln\theta_n$ field. Since no information was available to determine the
correlation between the $\ln K$ and $\ln\theta_n$ random fields, these random fields were initially assumed
to be statistically uncorrelated. Simulations of perfectly negative and weakly negative
correlation cases were also conducted to determine the sensitivity of the estimates to
these assumptions.
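The negative-exponential covariance assumed for $\ln K$ and $\ln\theta_n$ can be written as $C(r) = \sigma^2 \exp(-r/\lambda)$ for an isotropic field, where $r$ is the separation distance. The sketch below evaluates it for a few block separations; the isotropic Euclidean distance and the function names are assumptions for illustration, and the closed-form expressions actually used are those of the dissertation's Appendix A.

```python
import math


def exp_covariance(separation, sigma, corr_length):
    """Isotropic negative-exponential covariance at a given separation (m)."""
    return sigma ** 2 * math.exp(-separation / corr_length)


def separation(p, q):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


SIGMA_F, LAMBDA_F = 1.0, 0.3048  # ln K standard deviation and correlation length

# Covariance between a block center and its neighbors in x, y, and z.
origin = (0.127, 0.1945, 0.109)
for label, offset in (("x", (0.254, 0, 0)), ("y", (0, 0.389, 0)), ("z", (0, 0, 0.218))):
    neighbor = tuple(o + d for o, d in zip(origin, offset))
    r = separation(origin, neighbor)
    print(label, round(exp_covariance(r, SIGMA_F, LAMBDA_F), 3))
```

At one correlation length the covariance has decayed to $e^{-1}$ of the variance, so blocks more than a few grid spacings apart are only weakly correlated a priori.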

The  extended  Kalman  filter  algorithm  must  be  initialized  at  time  t  =  0.  The 
initialization  was  done  by  setting  the  initial  conditional  moments  equal  to  the  uncon- 
ditional moments.  In  the  OU-1  tracer  tests,  the  tracer  concentration  was  equal  to 
Table 4.2: Input parameters for stochastic simulation using OU-1 IWPT data

Parameter                                        Value
Simulation domain size                           3.556 m x 3.5 m x 1.524 m
Discretization                                   $N_x$ = 14, $N_y$ = 9, $N_z$ = 7
                                                 $\Delta x$ = 0.254 m, $\Delta y$ = 0.389 m, $\Delta z$ = 0.218 m
Total flow rate (4-day average)                  $Q$ = 4.5524 m3/day
Time of injection                                $t_b$ = 3.3 hrs = 0.1375 days
Partitioning coefficient of DMP                  $K_N$ = 12.9
Volumetric water content                         $\theta_w$ = 0.21
Mean hydraulic gradient                          $J$ = 5 x 10^-2
Longitudinal (transverse) dispersivity           $\alpha_L$ ($\alpha_T$) = 0.1 m
Mean hydraulic conductivity                      $K$ = 17.1 m/day
$\ln K$ standard deviation                       $\sigma_f$ = 1.0
$\ln K$ correlation length                       $\lambda_f$ = 0.3048 m
Mean of unconditional $S_n$                      $\bar{S}_n$ = 0.046
Unconditional $\ln\theta_n$ standard deviation   $\sigma_y$ = 0.8637
$\ln\theta_n$ correlation length                 $\lambda_y$ = 0.3048 m
Unconditional mean of Darcy flux components      $\bar{q}_x$ = 0.8535 m/day
                                                 $\bar{q}_y$ = $\bar{q}_z$ = 0.0
Mean of $\eta$                                   $\bar{\eta}$ = 0
Standard deviation of $\eta$                     $\sigma_\eta$ = 0.8636
Correlation length of $\eta$                     $\lambda_\eta$ = 0.3048 m

zero prior to injection; hence at $t_0 = 0$, the concentration moments were initialized
as follows:

$$\hat{c}(\mathbf{x}, t_0 | t_0) = 0 \qquad (4.2)$$

$$P_{cc}(\mathbf{x},\mathbf{x}',t_0|t_0) = P_{q_i c}(\mathbf{x},\mathbf{x}',t_0|t_0) = P_{yc}(\mathbf{x},\mathbf{x}',t_0|t_0) = 0, \quad \text{for } i = 1, 2, 3. \qquad (4.3)$$

Equation (4.3) states that at time $t_0 = 0$ all of the concentration second moments are
equal to zero because the initial concentration values are known perfectly.

The  unconditional  means  of  the  flux  components  were  estimated  from  the 
total  pumping  rate  Q,  and  they  were  used  to  initialize  the  corresponding  Darcy  flux 
Table 4.3: Locations of multilevel samplers in experimental and simulation domain

          Experiment Domain            Simulation Domain
MLS       x (meters)   y (meters)      x (meters)   y (meters)
MLS01     0.635        0.58            0.635        0.583
MLS02     1.395        0.58            1.397        0.583
MLS03     2.165        0.58            2.159        0.583
MLS04     2.925        0.58            2.921        0.583
MLS05     0.635        1.74            0.635        1.750
MLS06     1.395        1.74            1.397        1.750
MLS07     2.165        1.74            2.159        1.750
MLS08     2.925        1.74            2.921        1.750
MLS09     0.635        2.90            0.635        2.917
MLS10     1.395        2.90            1.397        2.917
MLS11     2.165        2.90            2.159        2.917
MLS12     2.925        2.90            2.921        2.917

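moments:

$$\hat{q}_x(\mathbf{x}|t_0) = \bar{q}_x(\mathbf{x}) = \frac{Q}{l_y \times l_z} \qquad (4.4)$$

$$\hat{q}_y(\mathbf{x}|t_0) = \bar{q}_y(\mathbf{x}) = 0 \qquad (4.5)$$

$$\hat{q}_z(\mathbf{x}|t_0) = \bar{q}_z(\mathbf{x}) = 0 \qquad (4.6)$$

In equation (4.4), $l_y \times l_z$ represents the area of the cross-section perpendicular to the
mean flow direction. The flux covariance matrix is initialized with the expression:

$$P_{q_i q_j}(\mathbf{x},\mathbf{x}'|t_0) = P_{q_i q_j}(\mathbf{x},\mathbf{x}'), \quad \text{for } i, j = 1, 2, 3. \qquad (4.7)$$

The conditional mean of $\ln\theta_n$ was initialized with its unconditional mean based on
the volume-averaged value of NAPL residual saturation $S_n|_{Total}$ = 0.046 for the entire
test cell estimated by Annable [3] using the temporal moment method [72]. Recalling
that $\theta_n = \frac{S_n}{1 - S_n}\theta_w$ and $y = \ln\theta_n$, this value yields:

$$\hat{y}(\mathbf{x}|t_0) = \bar{y}(\mathbf{x}) = -4.6193 \qquad (4.8)$$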
Table 4.4: Color codes corresponding to elevations above clay for multilevel sampler
measurement points in experimental and simulation domain.

          Elevation above clay (m)
Color     Experiment domain    Simulation domain
Yellow    0.0                  0.109
White     0.38                 0.327
Red       0.76                 0.762
Blue      1.14                 1.197
Black     1.52                 1.415


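Similarly, the conditional auto-covariance $P_{yy}$ and cross-covariance $P_{q_i y}$ are
initialized with their respective unconditional values:

$$P_{yy}(\mathbf{x},\mathbf{x}'|t_0) = P_{yy}(\mathbf{x},\mathbf{x}') \qquad (4.9)$$

$$P_{q_i y}(\mathbf{x},\mathbf{x}'|t_0) = P_{q_i y}(\mathbf{x},\mathbf{x}'), \quad \text{for } i = 1, 2, 3. \qquad (4.10)$$

Note that closed-form expressions for $P_{q_i q_j}$, $P_{yy}$, and $P_{q_i y}$ are given in Appendix A.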

4.5    Discussion  of  the  Results 
4.5.1    Estimated  parameter  distributions 

Figure 4.4 plots the estimated NAPL residual saturation $S_n$ in five horizontal
layers at 0.109, 0.327, 0.762, 1.197, and 1.415 m above the clay bed,
approximately corresponding to the five vertical sampling points along the MLSs color-coded
yellow, white, red, blue, and black, respectively. Figure 4.5 plots three vertical slices
of the estimated $S_n$ distribution located at 0.583, 1.750, and 2.917 m from the origin
in the y-direction, which are in line with EW-1, EW-2, and EW-3, respectively. Figure
4.23 summarizes the estimated NAPL residual saturation $S_n$ at five MLS elevations.
Note that the NAPL residual saturation, $S_n = \theta_n/(\theta_n + \theta_w)$, is plotted, although $\ln\theta_n$ is the
variable directly estimated by the inverse algorithm. In general, the estimated NAPL
saturation is lower in the upper portion of the test cell (e.g., blue and black layers)
and higher near the clay confining layer (e.g., yellow and white layers). This is
believed to be the result of the historical fluctuations of the water table just above the
confining layer [66]. A trend of increasing $S_n$ from the EW-1 side of the cell to the EW-3
side is clearly noticeable. A very high NAPL region, with $S_n$ approximately
12%, is detected near the MLS-9 Black sampling point (see Figure 4.6). This
is consistent with the measured tracer breakthrough curves, which show a large
separation between the partitioning (DMP) and nonpartitioning (bromide) tracers
at this location. The model also predicts a low NAPL region between the MLS-5 Red
(see Figure 4.7) and MLS-6 Red (see Figure 4.8) points, which can be attributed to
the relatively small separation observed in the tracer breakthrough curves measured
at MLS-6 versus the larger separation observed at upstream MLS-5.

Figures 4.9 and 4.10 show the $\ln\theta_n$ estimation uncertainties at the same planes
shown in Figures 4.4 and 4.5. These figures indicate that NAPL estimates obtained
in upstream regions are generally associated with lower uncertainties. Downstream,
immediately after the last row of MLSs, the prediction uncertainties approach the
unconditional $\ln\theta_n$ standard deviation because very limited information can be gained
in this region from the available concentration measurements. The $S_n$ distribution
estimated by the stochastic model has a sample mean of 0.0476 and a sample standard
deviation of 0.0225, giving a coefficient of variation of 0.473. The fact that the
estimated average $S_n$ (0.0476) is very close to $S_n|_{Total}$ = 0.046 estimated by Annable
et al. [3] suggests that the model performs well and recovers the correct volume of NAPL.

Figure 4.11 shows the x-y component of the estimated Darcy flux vector,
i.e., $\mathbf{q}_{xy} = \mathbf{q}_x + \mathbf{q}_y$, at five horizontal layers at 0.109, 0.327, 0.762, 1.197, and
1.415 m above the clay bed, each approximately corresponding to the five vertical
sampling points along the MLSs color-coded yellow, white, red, blue, and black,
respectively. Figure 4.12 shows the x-z component of the estimated Darcy flux vector, i.e.,
$\mathbf{q}_{xz} = \mathbf{q}_x + \mathbf{q}_z$, at three vertical slices in line with EW-1, EW-2, and EW-3. The
estimated flux distribution clearly shows the presence of a higher flux zone in the
center of the test cell extending vertically from the clay confining unit to
approximately 0.327 m (white layer) above the clay. In general, the EW-3 side of the cell has a
higher flux than the EW-1 side. The flux distribution estimated by the stochastic
model compares favorably with that inferred from the field hydraulic tests discussed by
Annable et al. [4] and Sillan et al. [112]. Figure 4.13 shows the conditional mean
concentration of bromide and 2,2-dimethyl-3-pentanol on the red layer (0.762 m above the
clay confining unit) at conditioning times of 0.48 days and 1.03 days. Again, it shows
that the nonpartitioning tracer bromide travels faster on the EW-3 side of the cell.

In a second simulation, the $\ln\theta_n$ and $\ln K$ fields are assumed to be perfectly
negatively correlated, which implies that the residual NAPL tends to reside in small
pores where the effective saturated hydraulic conductivity is small. The results are
plotted in Figures 4.14 through 4.16. The estimated $S_n$ distribution has a sample
mean of 0.0546, a sample standard deviation of 0.0322, and a coefficient of variation
of 0.590, suggesting that slight overestimation has occurred compared to the soil
core data. Although the estimated NAPL distribution displays a trend similar to that
of the soil cores, the $S_n$ maps (Figures 4.14 and 4.15) contain more high NAPL
saturation areas throughout the domain. Uncertainties associated with $S_n$ estimation are
lower than in the uncorrelated $\ln\theta_n$-$\ln K$ case. This is consistent with the results of the
unconditional moment analyses (Chapter 2), which found that the perfectly negative
correlation assumption tends to result in the strongest correlation between the measured
concentration and the NAPL content to be estimated.

The third simulation (Figures 4.18 and 4.19) assumes that the $\ln\theta_n$ and $\ln K$
fields are weakly negatively correlated, i.e., $\rho = -0.25$. The resulting $S_n$ field, with an
average $S_n$ of 0.0486, a standard deviation of 0.0227, and a coefficient of
variation of 0.467, appears quite similar to the uncorrelated case. Comparison of
all three cases seems to suggest that with an increased degree of negative correlation
between the $\ln\theta_n$ and $\ln K$ fields, the resulting NAPL distribution becomes increasingly
variable.
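The coefficients of variation quoted for the three correlation cases follow directly from the reported sample means and standard deviations. The short check below recomputes them; the dictionary layout is a hypothetical convenience, and only the numbers quoted in the text are used.

```python
# (sample mean, sample standard deviation) of estimated S_n for each rho.
cases = {
    0.0: (0.0476, 0.0225),
    -0.25: (0.0486, 0.0227),
    -1.0: (0.0546, 0.0322),
}

coefficient_of_variation = {
    rho: std / mean for rho, (mean, std) in cases.items()
}

for rho in sorted(coefficient_of_variation, reverse=True):
    print(f"rho = {rho:5.2f}  CV = {coefficient_of_variation[rho]:.3f}")
```

The recomputed values agree with those reported to three decimals. The difference between the uncorrelated and weakly correlated cases is small, and most of the increase in variability appears in the perfectly negative case.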

4.5.2    NAPL  distribution  based  on  soil  core  and  moment  analysis 

During  well  and  MLS  installation,  a  total  of  120  soil  samples  were  collected  and 
used  to  quantify  initial  NAPL  constituent  mass  present  in  the  test  cell.  As  indicated  in 
Figure  4.1,  cores  were  collected  at  the  injection  wells,  extraction  wells,  and  center  row 
of  MLSs.  Laboratory  analysis  on  the  LNAPL  sampled  from  extraction  wells  suggested 
that  the  NAPL  at  OU-1  has  numerous  constituents  (>  200)  [112].  Table  4.5  lists  a 
few  target  analytes  selected  for  monitoring  during  the  study  that  represent  various 
chemical  classes  and  mass  fractions  in  the  NAPL.  Using  the  average  concentration  of 
each  target  constituent  in  the  cell  and  measured  mass  fractions  of  these  constituents 
in  the  NAPL,  estimates  of  the  average  residual  NAPL  saturation  within  the  cell  were 
made  (Table  4.6)  using  the  following  equation: 

$$S_n = \frac{C_{soil}\,\rho_b}{\lambda_i\,\rho_{NAPL}\,\phi} \qquad (4.11)$$

where $C_{soil}$ is the soil concentration of a NAPL constituent; $\lambda_i$ represents the mass
fraction of the constituent in the NAPL; $\rho_b \approx 1.7$ g/cm3 is the soil bulk density, while $\rho_{NAPL} \approx$
0.85 g/cm3 is the density of the NAPL; and $\phi$ is the cell porosity. Table 4.6 summarizes these
data and indicates considerable discrepancies among the $S_n$ estimates from different
constituents, as several constituents have very low mass fractions that may lead to
unreliable estimates. The most reliable are 1,2-dichlorobenzene and n-decane, which
have relatively high mass fractions in the NAPL.

Figure  4.20  and  Figure  4.21  present  the  soil  concentration  data  from  soil  cores 
for  1,2-dichlorobenzene  (DCB)  and  n-decane,  respectively.  Since  soil  concentration 
linearly  relates  to  Sn  (equation  (4.11)),  the  distribution  of  NAPL  may  be  qualitatively 
inferred  from  these  data.  Analysis  of  soil  cores  indicated  the  following:  (1)  the  NAPL 
was  located  in  a  1.5-meter  smear  zone  immediately  above  the  clay  confining  unit;  (2) 
Table 4.5: Target analyte concentration in an LNAPL sample collected from OU-1
test cell.

Target Analyte              Mass Fraction (mg/g)
p-,m-xylene                 0.17
o-xylene                    0.85
1,2-dichlorobenzene         6.10
1,3,5-trimethylbenzene      0.83
1,2,4-trimethylbenzene      0.71
naphthalene                 0.11
1,2,4-trichlorobenzene      11.81
n-decane                    5.20
n-undecane                  16.00
n-tridecane                 2.93

Table 4.6: Comparison of NAPL saturations predicted by the soil core analysis and
partitioning tracers.

NAPL Constituent          Mass Fraction    Average Soil             NAPL
                          in NAPL          Concentration (mg/kg)    saturation
1,1,1-trichloroethane     0.00016          1.0                      0.21
toluene                   0.00074          3.0                      0.14
naphthalene               0.0011           2.3                      0.076
1,2-dichlorobenzene       0.0061           32                       0.016
n-decane                  0.0052           67                       0.041

the  largest  amounts  of  NAPL  were  near  the  water-table  position  prior  to  cell  instal- 
lation; (3)  NAPL  penetrated  about  0.5  m  into  the  confining  unit  via  sand  stringers 
near  the  aquifer-aquitard  interface  [99,  112].  Overall,  the  residual  NAPL  saturation 
is  lower  in  the  upper  portion  of  the  test  cell,  highest  near  the  clay  confining  unit,  and 
greater  on  the  EW-3  side  (MLS  9-12)  of  the  cell  than  on  the  EW-1  side  (MLS  1-4). 
This  is  consistent  with  the  trends  predicted  by  the  optimal  estimation  algorithm. 
Annable et al. [3] conducted moment analyses on the EW data for bromide
and 2,2-dimethyl-3-pentanol to estimate the average NAPL saturation and the total
NAPL volume within the swept zones for each of the three extraction wells (Table
4.7). Comparison of the $S_n$ estimates based on data for the individual extraction wells
also indicates a general trend of increasing NAPL content from the EW-1 side to the
EW-3 side of the cell. This is consistent with the location of the suspected source of
NAPL (i.e., CPDs) relative to the cell. The total swept volume of 4.96 m3 yields a
cell-averaged mobile water content of 0.21, which was used in the stochastic optimal
estimation algorithm. Note that of the total swept volume, approximately 41% is
captured by EW-1, with the other wells at approximately 29% each. These data
suggest that the EW-1 side of the test cell contains less permeable regions compared
to the EW-3 side.

Table 4.7: Mass recovery, well swept volume, and average NAPL saturation (after
Annable et al. [3]).

          Bromide Mass     Bromide Swept Volume (m3)     $S_n$
Well      Recovery (%)     Data       Extrapolated       Data       Extrapolated
EW-1      35               2.03       2.11               0.029      0.031
EW-2      35               1.43       1.48               0.050      0.049
EW-3      32               1.50       1.72               0.065      0.088
Total     102              4.96       5.32               0.046      0.054
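The swept-volume percentages quoted above, and the "Total" saturation in Table 4.7, can be checked from the per-well entries. The sketch below assumes that the total $S_n$ is the swept-volume-weighted average of the well values; the text does not state this explicitly, but the numbers are consistent with it.

```python
# Per-well swept volume (m^3) and S_n from the "Data" columns of Table 4.7.
swept_volume = {"EW-1": 2.03, "EW-2": 1.43, "EW-3": 1.50}
saturation = {"EW-1": 0.029, "EW-2": 0.050, "EW-3": 0.065}

total_volume = sum(swept_volume.values())

# Share of the total swept volume captured by each extraction well.
volume_share = {well: v / total_volume for well, v in swept_volume.items()}

# Assumed: cell-average S_n is the swept-volume-weighted mean.
weighted_sn = sum(swept_volume[w] * saturation[w] for w in swept_volume) / total_volume

for well, share in volume_share.items():
    print(f"{well}: {100 * share:.1f}% of swept volume")
print(f"volume-weighted S_n = {weighted_sn:.3f}")
```

EW-1 captures about 41% of the swept volume and the other two wells about 29% and 30%, in line with the percentages given in the text.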

Tracer breakthroughs of bromide and 2,2-dimethyl-3-pentanol measured at
multilevel sampling points for the same tracer tests were also used by Helms et al. [66]
to characterize the NAPL distribution. Measurements at MLS points were considered
to represent resident concentration, and the resulting $S_n$ values from the moment analysis
were viewed as representative of the zone between tracer injection and the monitoring
point. Assuming that the tracer flow path taken is representative of the general zone
between each measurement location, Helms et al. [66] used a "differential retardation"
approach to determine the differential values of average NAPL saturation between
consecutive monitoring points. Helms' results (Figure 4.22) on the NAPL distribution
compared favorably with those from the soil core data and from the optimal estimation
algorithm (see Figure 4.23, which plots the estimated NAPL residual saturation
$S_n$ at five MLS elevations based on the stochastic inverse methods). Overall, the
estimated NAPL distribution is in general agreement with both the analyses of soil
core samples [99, 112] and the distributions predicted by the temporal moment methods [3, 66].
Figure 4.4: Estimation of NAPL residual saturation $S_n$ distribution based on stochastic
inverse methods. Shown above are five horizontal layers (0.109, 0.327, 0.762, 1.197,
and 1.415 m above the clay, corresponding to layers yellow, white, red, blue, and black,
respectively). $\rho = 0$.

Figure 4.5: Estimation of NAPL residual saturation $S_n$ distribution based on stochastic
inverse methods. Shown above are three vertical slices Y2, Y5, and Y8 (0.583,
1.750, and 2.917 m from the origin in the y-direction, respectively). $\rho = 0$.

Figure 4.6: Breakthrough curves of nonpartitioning tracer bromide and partitioning
tracer DMP at MLS09 measured in OU-1 field tracer experiments.

Figure 4.7: Breakthrough curves of nonpartitioning tracer bromide and partitioning
tracer DMP at MLS05 measured in OU-1 field tracer experiments.

Figure 4.8: Breakthrough curves of nonpartitioning tracer bromide and partitioning
tracer DMP at MLS06 measured in OU-1 field tracer experiments.

Figure 4.9: Contour of estimation error of $\ln\theta_n$ at five horizontal layers (0.109, 0.327,
0.762, 1.197, and 1.415 m above the clay, corresponding to layers yellow, white, red,
blue, and black, respectively). $\rho = 0$.

Figure 4.10: Contour of estimation error of $\ln\theta_n$ at three vertical slices Y2, Y5, and
Y8 (0.583, 1.750, and 2.917 m from the origin in the y-direction, respectively). $\rho = 0$.

Figure 4.11: Vector plot of x-y component ($\mathbf{q}_{xy}$) of estimated Darcy flux at five
horizontal layers (0.109, 0.327, 0.762, 1.197, and 1.415 m above the clay, corresponding
to layers yellow, white, red, blue, and black, respectively). $\rho = 0$.

Figure 4.12: Vector plot of x-z component ($\mathbf{q}_{xz}$) of estimated Darcy flux at three
vertical slices Y2, Y5, and Y8 (0.583, 1.750, and 2.917 m from the origin in the y-direction,
respectively). $\rho = 0$.

Figure 4.13: Simulated mean concentration contours of bromide and 2,2-dimethyl-3-pentanol
at Layer-Red (0.762 m above the clay) at conditioning times 0.48 days and
1.03 days. $\rho = 0$.

Figure 4.14: Estimation of NAPL residual saturation $S_n$ distribution based on
stochastic inverse methods. Shown above are five horizontal layers (0.109, 0.327,
0.762, 1.197, and 1.415 m above the clay, corresponding to layers yellow, white, red,
blue, and black, respectively). $\rho = -1$.

Figure 4.15: Estimation of NAPL residual saturation $S_n$ distribution based on
stochastic inverse methods. Shown above are three vertical slices Y2, Y5, and Y8
(0.583, 1.750, and 2.917 m from the origin in the y-direction, respectively). $\rho = -1$.

Figure 4.16: Contour of estimation error of $\ln\theta_n$ at five horizontal layers (0.109, 0.327,
0.762, 1.197, and 1.415 m above the clay, corresponding to layers yellow, white, red,
blue, and black, respectively). $\rho = -1$.

Figure 4.17: Contour of estimation error of $\ln\theta_n$ at three vertical slices Y2, Y5, and
Y8 (0.583, 1.750, and 2.917 m from the origin in the y-direction, respectively). $\rho = -1$.

Figure 4.18: Estimation of NAPL residual saturation $S_n$ distribution based on
stochastic inverse methods. Shown above are five horizontal layers (0.109, 0.327,
0.762, 1.197, and 1.415 m above the clay, corresponding to layers yellow, white, red,
blue, and black, respectively). $\rho = -0.25$.

Figure 4.19: Estimation of NAPL residual saturation $S_n$ distribution based on
stochastic inverse methods. Shown above are three vertical slices Y2, Y5, and Y8
(0.583, 1.750, and 2.917 m from the origin in the y-direction, respectively). $\rho = -0.25$.

Figure 4.20: 1,2-dichlorobenzene (DCB) concentration distribution in soil cores taken
prior to cosolvent flushing (after Sillan et al. [112]).

Figure 4.21: n-decane concentration distribution in soil cores taken prior to cosolvent
flushing (after Sillan et al. [112]).

Figure 4.22: Estimation of NAPL saturation variability from MLS tracer data using
temporal moment methods (after Helms et al. [66]).

Figure 4.23: Estimation of NAPL residual saturation $S_n$ at five MLS elevations based
on stochastic inverse methods. $\rho = 0$.
4.6  Summary

The results of this chapter show that the three-dimensional stochastic optimal
estimation algorithm can be successfully used to estimate the hydrogeochemical
parameters of the OU-1 test cell at Hill AFB, Utah. Using a small subset of the available
tracer concentration measurements, distributions of NAPL residual saturation and
Darcy flux can be characterized. The estimated NAPL and flux distributions show
good agreement with those predicted via soil coring observations and temporal
moment analyses. It is noted that the sorption process modeled in this dissertation is
the total sorption, which may interfere with background adsorption to the porous media.

The stochastic optimal estimation algorithm selectively uses field
concentration observations of a nonpartitioning tracer (bromide) and a partitioning tracer
(2,2-dimethyl-3-pentanol) from 49 multilevel sampling points. Because the primary
purpose of this effort was parameter estimation, the conditioning process was carried out
at only two measurement times in order to reduce computational effort. Justification
for using a minimal number of conditioning times was given in the previous chapters.
In general, lower NAPL saturations were found in the upper portion of the test cell
(e.g., blue and black layers), and higher NAPL saturations were found near the clay
confining layer (e.g., yellow and white layers). A trend of increasing $S_n$ from the
EW-1 side of the cell to the EW-3 side of the cell was predicted. The algorithm also
generates conditional standard deviations for the estimated parameters, which provide
a measure of estimation reliability (or uncertainty). The reduction in uncertainty for
the uncorrelated case was found to range from 0.2637 near measurement locations to
0.0637 away from measurement locations.

The Kalman filtering algorithm developed here provides a means of combining
physically based models with field data using the Bayesian conditioning theory.
Although in this study only the concentration measurements were available to
condition the Darcy flux and NAPL fields, inclusion of head, flux, or hydraulic
conductivity measurements can be easily implemented in the model and is expected
to improve the quality of the estimated parameters [63]. Overall, this model
provides an effective tool to characterize hazardous waste sites and thus
benefits the design and implementation of field remediation activities.

Because of its three-dimensional nature, the algorithm demands large computer
resources, primarily because of the storage required for the large state
variable covariance matrix. Computational effort for the problem examined here
is moderate on a SUN Ultra-Sparc workstation. However, future studies using
faster computers and more reliable numerical solution algorithms will be
necessary in order to improve computational accuracy by increasing the
discretization resolution.


CHAPTER  5 
CONCLUSIONS 


This  dissertation  demonstrates  a  stochastic  approach  to  investigate  the  transport  of 
a  partitioning  tracer  in  a  three-dimensional  heterogeneous  aquifer  contaminated  by 
nonaqueous  phase  liquids  (NAPL).  A  distributed  parameter  extended  Kalman  filter 
is  developed  to  identify  the  distributions  of  NAPL  residual  saturation  and  Darcy  flux 
using  measured  partitioning  and  nonpartitioning  tracer  concentrations.  Using  the 
Bayesian  conditioning  theory,  this  algorithm  provides  an  effective  means  of  combining 
the  physically  based  groundwater  flow  and  transport  models  with  field  data  and  hence 
generates  site-specific  predictions.  The  distributed  parameter  extended  Kalman  filter 
computes  estimates  of  the  conditional  ensemble  mean  and  auto-  (cross-)  covariances 
of  the  random  variables  of  interest  at  any  time  or  location.  The  conditional  mean 
is  the  optimal  (minimal-variance,  unbiased)  estimate  of  the  random  variable  in  any 
given  replicate.  The  conditional  covariance  provides  a  measure  of  the  uncertainty  of 
the  estimate,  and  informative  guidance  on  the  design  of  field  monitoring  network  and 
remediation  strategy. 
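
The conditioning step summarized above can be illustrated, for a linear (or linearized) measurement model, by the standard Kalman measurement update. The following is a minimal sketch in Python with NumPy, not the implementation used in this work; the function name, the three-node field, the measurement matrix `H`, and all numerical values are assumptions chosen only for illustration.

```python
import numpy as np

def condition_on_measurements(mean, cov, H, z, R):
    """Minimum-variance (Kalman) update of a state mean and covariance."""
    S = H @ cov @ H.T + R                  # innovation covariance
    K = np.linalg.solve(S, H @ cov).T      # gain K = cov H^T S^-1 (S, cov symmetric)
    cond_mean = mean + K @ (z - H @ mean)  # conditional mean
    cond_cov = cov - K @ H @ cov           # conditional covariance
    return cond_mean, cond_cov

# Hypothetical three-node random field with an exponential prior covariance.
x = np.array([0.0, 1.0, 2.0])
prior_mean = np.zeros(3)
prior_cov = 0.5 * np.exp(-np.abs(x[:, None] - x[None, :]) / 1.5)

H = np.array([[1.0, 0.0, 0.0]])            # only node 0 is observed
R = np.array([[1e-4]])                     # assumed measurement-error variance
z = np.array([0.8])                        # assumed observation

cond_mean, cond_cov = condition_on_measurements(prior_mean, prior_cov, H, z, R)
prior_std = np.sqrt(np.diag(prior_cov))
cond_std = np.sqrt(np.diag(cond_cov))      # smallest at the observed node
```

In this toy case the conditional standard deviation is smallest at the observed node and grows back toward the prior value with distance, which is the behavior the conditional covariance is meant to quantify.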

The method assumes that tracer transport can be described by the classical
advection-dispersion-retardation equation, where the Darcy flux is random and
the retardation factor for the partitioning tracers depends on the randomly
variable residual NAPL content and the tracer-NAPL partitioning coefficient.
Using a first-order small-perturbation assumption, stochastic partial
differential equations for the ensemble moments (mean, auto- and
cross-covariances) of the random concentration, Darcy flux, and NAPL fields are
derived and solved numerically with a finite-difference scheme in conjunction
with a square-root algorithm. These ensemble moments are conditioned at discrete
times when measurements become available. To reduce the intensive computational
demand due to the three-dimensional nature of the problem, a sequential
filtering algorithm is adopted. This is based on the fact that concentration
breakthrough data from a partitioning tracer provide information on both NAPL
and Darcy flux, while data from a nonpartitioning tracer provide information on
Darcy flux only. The sequential filtering algorithm proceeds in two steps:
first, condition the Darcy flux field only on the nonpartitioning tracer
measurements; second, using the conditional Darcy flux, condition only the NAPL
field on the partitioning tracer measurements.
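
The two-step sequential procedure can be sketched as two successive linear updates. This is only an illustration under strong assumptions: the sensitivity matrices `H_q`, `G_q`, and `G_y` stand in for linearized tracer responses and are random placeholders, the prior flux and NAPL fields are taken as uncorrelated, and the residual flux uncertainty in the second step is treated as additional measurement noise. None of these names or values come from the model developed here.

```python
import numpy as np

def kalman_update(mean, cov, H, z, R):
    """Linear minimum-variance update of a mean vector and covariance matrix."""
    S = H @ cov @ H.T + R
    K = np.linalg.solve(S, H @ cov).T
    return mean + K @ (z - H @ mean), cov - K @ H @ cov

rng = np.random.default_rng(0)
n = 4                                             # hypothetical number of blocks

# Assumed prior statistics for Darcy flux q and log NAPL content y.
q_mean, q_cov = np.full(n, 1.0), 0.04 * np.eye(n)
y_mean, y_cov = np.full(n, -4.6), 0.75 * np.eye(n)

# Placeholder linearized sensitivities of the two tracer data sets.
H_q = rng.normal(size=(2, n))                     # nonpartitioning tracer w.r.t. q
G_q = rng.normal(size=(2, n))                     # partitioning tracer w.r.t. q
G_y = rng.normal(size=(2, n))                     # partitioning tracer w.r.t. y
R = 1e-3 * np.eye(2)                              # assumed measurement-error covariance

z_nonpart = H_q @ q_mean + np.array([0.05, -0.02])
z_part = G_q @ q_mean + G_y @ y_mean + np.array([0.10, 0.03])

# Step 1: condition the flux field on the nonpartitioning tracer only.
q_hat, q_cov_c = kalman_update(q_mean, q_cov, H_q, z_nonpart, R)

# Step 2: hold the flux at its conditional mean and condition only the NAPL field
# on the partitioning tracer; leftover flux uncertainty inflates the noise term.
R_eff = R + G_q @ q_cov_c @ G_q.T
y_hat, y_cov_c = kalman_update(y_mean, y_cov, G_y, z_part - G_q @ q_hat, R_eff)
```

Splitting the update this way keeps each step's matrices small, which is the motivation given above for the sequential algorithm.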

Results from the unconditional simulations lead to a better understanding of
reactive solute migration in hydrogeochemically heterogeneous porous formations,
particularly the statistical correlations between the concentration observations
and the aquifer hydrogeochemical parameters. Cases with varying correlation
between the hydraulic conductivity and NAPL random fields are presented. Flux
heterogeneity, induced by the hydraulic conductivity heterogeneity, is the
primary contributor to macrodispersion and to the spatial variation of the
concentration field. The presence of NAPL heterogeneity can either increase or
decrease longitudinal macrodispersion depending on the sign of the
NAPL-conductivity correlation. Both the magnitude and sign of the
NAPL-conductivity correlation significantly affect the prediction uncertainty.
The unconditional simulation results show that a perfectly negative
NAPL-conductivity correlation leads to the highest prediction uncertainty and
the strongest correlation between concentration and NAPL.

Analysis of the cross-correlations $\rho_{cy}(\mathbf{x}',\mathbf{x},t)$ and
$\rho_{cq_i}(\mathbf{x}',\mathbf{x},t)$ reveals important information for
sampling network design in site characterization using interwell partitioning
tracers (IWPT). The unconditional results suggest that all concentration
measurements taken at a particular location contain approximately equal
information about NAPL and Darcy flux distributions within the experimental
domain because the magnitudes of $\rho_{cy}(\mathbf{x}',\mathbf{x},t)$ and
$\rho_{cq_i}(\mathbf{x}',\mathbf{x},t)$ do not change substantially with time.
Hence a snap-shot sampling scheme, which measures concentration intensively in
space only once, may be most effective to produce accurate NAPL and Darcy flux
estimates.

Demonstration of the distributed parameter extended Kalman filter with
synthetic data sets confirms the hypothesis made in the unconditional moment
analysis. The conditional simulation results indicate that the algorithm is able
to capture the dominant features of the NAPL and Darcy flux distributions and to
accurately predict the estimation error. Information drawn from concentration
measurements by the filtering algorithm improves the parameter estimates
primarily in the areas upstream of the sampling locations. Parameters estimated
downstream of the last row of sampling points approach their unconditional
values, and the conditional standard deviations likewise approach the
unconditional standard deviations. Conditioning based on the snap-shot sampling
scheme produces the most accurate estimates.

The filtering algorithm is also applied to a field data set from the tracer
tests conducted by UF researchers at the OU-1 test cell at Hill AFB, Utah.
Informative maps of NAPL residual saturation and Darcy flux generated by the
algorithm are qualitatively consistent with those from soil core samples
[99, 112] and those predicted by the temporal moment methods [3, 66].

The approach developed here has many potential practical applications. One
application is to characterize the amount and spatial distribution of existing
contaminant source zones for designing remediation strategies or other general
public health purposes. The algorithm can also be used to model the existing
contaminant plume extent and movement. Another application is to identify
aquifer hydraulic parameters such as Darcy flux and hydraulic conductivity,
which are of great significance to water resources protection and management.

One of the limitations of the filtering algorithm is its dependence on the
small-perturbation assumption, which may not be valid in geologic formations in
which the variance of the logarithm of conductivity is much greater than one.
This may be circumvented by pre-conditioning the velocity field with head or
conductivity measurements [60], and/or pre-conditioning the NAPL field with NAPL
measurements. The pre-conditioning procedure suggested by Graham [60] is similar
to the sequential filtering algorithm adopted in this research. By conditioning
the Darcy flux field with nonpartitioning tracer measurements, the conditional
mean Darcy flux becomes more variable and the conditional standard deviation
becomes smaller, i.e., the mean accounts for a large portion of the total
variability. Hence use of the conditional Darcy flux moments (mean and
covariance) in the subsequent NAPL conditioning with partitioning tracer
measurements should improve the NAPL estimates. Another possible source of error
affecting the model output is that the model requires prior knowledge of the
statistical parameters of the independent random variables, such as the variance
and correlation length of the NAPL and conductivity fields, which may be
difficult to obtain in practice.

Future investigation of more efficient solution techniques will be necessary
before this filtering algorithm can be applied extensively to three-dimensional
real-world applications. As pointed out in the previous chapters, implementation
of the algorithm requires extensive computer run time and large data storage,
primarily due to the large state variable covariance matrix. The dimension of
the state variable covariance matrix increases as more random variables are
considered in physically more complex problems. Accurate model predictions
demand dense domain discretization, which also increases the data storage as the
number of blocks increases. Hence further research into computationally
efficient simplifying assumptions or more sophisticated numerical solution
methods is required.


APPENDIX  A 
DERIVATION  OF  FLUX  COVARIANCES 


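Assuming hydraulic conductivity is a log-normal, stationary, random field, under
steady-state flow conditions, the flux spectral density function for
three-dimensional flow is [58]

$$S_{q_i q_j}(\mathbf{k}) = K_g^2 J_m J_n \left(\delta_{im} - \frac{k_i k_m}{k^2}\right)\left(\delta_{jn} - \frac{k_j k_n}{k^2}\right) S_{ff}(\mathbf{k}) \tag{A.1}$$

in which $\mathbf{k}$ is the wave number vector and $J_i$ is the mean hydraulic
gradient in the $i$-th direction. The geometric mean of the hydraulic
conductivity $K(\mathbf{x})$ is defined as
$K_g = \exp\left[\overline{\ln K(\mathbf{x})}\right]$, and $S_{ff}$ is the
spectral density function of $f$. Assuming a constant hydraulic gradient in the
$x_1$ direction, i.e., $J$, equation (A.1) becomes

$$S_{q_i q_j}(\mathbf{k}) = K_g^2 J^2 \left(\delta_{i1} - \frac{k_i k_1}{k^2}\right)\left(\delta_{j1} - \frac{k_j k_1}{k^2}\right) S_{ff}(\mathbf{k}) \tag{A.2}$$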

The hydraulic conductivity covariance structure is described by a
negative-exponential function, i.e.,

$$P_{ff}(\mathbf{s}) = \sigma_f^2 \exp\left(-\frac{s}{\lambda}\right) \tag{A.3}$$

and its spectral density function is

$$S_{ff}(\mathbf{k}) = \frac{\sigma_f^2 \lambda^3}{\pi^2 \left(1 + \lambda^2 k^2\right)^2} \tag{A.4}$$

where $\sigma_f$ and $\lambda$ are the standard deviation and the correlation
length of the log conductivity field, respectively; the separation vector
$\mathbf{s}$ is defined as $\mathbf{s} = \mathbf{x} - \mathbf{x}'$, with
magnitude $s$; and $k$ is the magnitude of the wave number vector. Note that the
components of $\mathbf{s}$ and $\mathbf{k}$ in a three-dimensional system are
$(s_1, s_2, s_3)$ and $(k_1, k_2, k_3)$, respectively, and the squares of their
magnitudes are defined as

$$s^2 = s_1^2 + s_2^2 + s_3^2 \tag{A.5}$$

$$k^2 = k_1^2 + k_2^2 + k_3^2 \tag{A.6}$$

The Darcy flux covariance can be derived by taking the inverse Fourier
transform of the flux spectral density function. The general expression for the
three-dimensional flux covariance between two points in space is

$$P_{q_i q_j}(\mathbf{s}) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \exp\left[i(k_1 s_1 + k_2 s_2 + k_3 s_3)\right] S_{q_i q_j}(k_1, k_2, k_3)\, dk_1\, dk_2\, dk_3 \tag{A.7}$$

where $i$ in the exponent is the imaginary unit $\sqrt{-1}$. Equation (A.1)
relates the flux spectral density function to the spectral density function of
log hydraulic conductivity. To evaluate the flux covariance, equation (A.4) is
substituted into equation (A.1), and then the resulting flux spectrum is
substituted into equation (A.7). The integral expressed by equation (A.7)
involves a complex variable and may be evaluated analytically using contour
integration in the complex plane [61]. The resulting components of the
$3 \times 3$ three-dimensional flux covariance matrix are listed below. Also
listed are the corresponding limits of the covariances as the separation vector
approaches zero. Note that $x = s_1$, $y = s_2$, $z = s_3$, and
$r = |\mathbf{s}|$.


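$$
\begin{aligned}
P_{q_1 q_1} = \frac{K_g^2 J^2 \sigma_f^2}{e^{r/\lambda} r^9} \Big(
& 36\lambda^5 r^4 - 36 e^{r/\lambda}\lambda^5 r^4 + 36\lambda^4 r^5 + 19\lambda^3 r^6 - e^{r/\lambda}\lambda^3 r^6 \\
& + 7\lambda^2 r^7 + 2\lambda r^8 + r^9 - 360\lambda^5 r^2 x^2 + 360 e^{r/\lambda}\lambda^5 r^2 x^2 - 360\lambda^4 r^3 x^2 \\
& - 174\lambda^3 r^4 x^2 - 6 e^{r/\lambda}\lambda^3 r^4 x^2 - 54\lambda^2 r^5 x^2 - 12\lambda r^6 x^2 - 2 r^7 x^2 \\
& + 420\lambda^5 x^4 - 420 e^{r/\lambda}\lambda^5 x^4 + 420\lambda^4 r x^4 + 195\lambda^3 r^2 x^4 + 15 e^{r/\lambda}\lambda^3 r^2 x^4 \\
& + 55\lambda^2 r^3 x^4 + 10\lambda r^4 x^4 + r^5 x^4 \Big)
\end{aligned} \tag{A.8}
$$

When $r \to 0$, the limit is

$$P_{q_1 q_1}(0,0,0) = \frac{8}{15}\,\sigma_f^2 K_g^2 J^2 \tag{A.9}$$

$$
\begin{aligned}
P_{q_1 q_2} = \frac{K_g^2 J^2 \sigma_f^2\, x y}{e^{r/\lambda} r^9} \Big(
& -180\lambda^5 r^2 + 180 e^{r/\lambda}\lambda^5 r^2 - 180\lambda^4 r^3 - 87\lambda^3 r^4 \\
& - 3 e^{r/\lambda}\lambda^3 r^4 - 27\lambda^2 r^5 - 6\lambda r^6 - r^7 + 420\lambda^5 x^2 - 420 e^{r/\lambda}\lambda^5 x^2 \\
& + 420\lambda^4 r x^2 + 195\lambda^3 r^2 x^2 + 15 e^{r/\lambda}\lambda^3 r^2 x^2 \\
& + 55\lambda^2 r^3 x^2 + 10\lambda r^4 x^2 + r^5 x^2 \Big)
\end{aligned} \tag{A.10}
$$

When $r \to 0$, the limit is

$$P_{q_1 q_2}(0,0,0) = 0 \tag{A.11}$$

$$
\begin{aligned}
P_{q_1 q_3} = \frac{K_g^2 J^2 \sigma_f^2\, x z}{e^{r/\lambda} r^9} \Big(
& -180\lambda^5 r^2 + 180 e^{r/\lambda}\lambda^5 r^2 - 180\lambda^4 r^3 - 87\lambda^3 r^4 \\
& - 3 e^{r/\lambda}\lambda^3 r^4 - 27\lambda^2 r^5 - 6\lambda r^6 - r^7 + 420\lambda^5 x^2 - 420 e^{r/\lambda}\lambda^5 x^2 \\
& + 420\lambda^4 r x^2 + 195\lambda^3 r^2 x^2 + 15 e^{r/\lambda}\lambda^3 r^2 x^2 + 55\lambda^2 r^3 x^2 \\
& + 10\lambda r^4 x^2 + r^5 x^2 \Big)
\end{aligned} \tag{A.12}
$$

When $r \to 0$, the limit is

$$P_{q_1 q_3}(0,0,0) = 0 \tag{A.13}$$

$$
\begin{aligned}
P_{q_2 q_2} = \frac{K_g^2 J^2 \sigma_f^2}{e^{r/\lambda} r^9 (y^2 + z^2)} \Big(
& -48\lambda^5 r^4 y^2 + 48 e^{r/\lambda}\lambda^5 r^4 y^2 - 48\lambda^4 r^5 y^2 - 22\lambda^3 r^6 y^2 \\
& - 2 e^{r/\lambda}\lambda^3 r^6 y^2 - 6\lambda^2 r^7 y^2 - \lambda r^8 y^2 + 420\lambda^5 r^2 x^2 y^2 - 420 e^{r/\lambda}\lambda^5 r^2 x^2 y^2 \\
& + 420\lambda^4 r^3 x^2 y^2 + 195\lambda^3 r^4 x^2 y^2 + 15 e^{r/\lambda}\lambda^3 r^4 x^2 y^2 + 55\lambda^2 r^5 x^2 y^2 \\
& + 10\lambda r^6 x^2 y^2 + r^7 x^2 y^2 - 420\lambda^5 x^4 y^2 + 420 e^{r/\lambda}\lambda^5 x^4 y^2 - 420\lambda^4 r x^4 y^2 \\
& - 195\lambda^3 r^2 x^4 y^2 - 15 e^{r/\lambda}\lambda^3 r^2 x^4 y^2 - 55\lambda^2 r^3 x^4 y^2 - 10\lambda r^4 x^4 y^2 \\
& - r^5 x^4 y^2 + 12\lambda^5 r^4 z^2 - 12 e^{r/\lambda}\lambda^5 r^4 z^2 + 12\lambda^4 r^5 z^2 + 5\lambda^3 r^6 z^2 \\
& + e^{r/\lambda}\lambda^3 r^6 z^2 + \lambda^2 r^7 z^2 - 60\lambda^5 r^2 x^2 z^2 + 60 e^{r/\lambda}\lambda^5 r^2 x^2 z^2 - 60\lambda^4 r^3 x^2 z^2 \\
& - 27\lambda^3 r^4 x^2 z^2 - 3 e^{r/\lambda}\lambda^3 r^4 x^2 z^2 - 7\lambda^2 r^5 x^2 z^2 - \lambda r^6 x^2 z^2 \Big)
\end{aligned} \tag{A.14}
$$

When $r \to 0$, the limit is

$$P_{q_2 q_2}(0,0,0) = \frac{1}{15}\,\sigma_f^2 K_g^2 J^2 \tag{A.15}$$

and when $y \to 0$, $z \to 0$, the limit is

$$
\begin{aligned}
P_{q_2 q_2}(x,0,0) = \frac{K_g^2 J^2 \sigma_f^2 \lambda}{e^{r/\lambda} r^6} \Big(
& -48\lambda^3 x^2 - 6\lambda x^4 - 48\lambda^4 r + 48 e^{r/\lambda}\lambda^4 r \\
& - 22\lambda^2 x^2 r - 2 e^{r/\lambda}\lambda^2 x^2 r - x^4 r \Big)
\end{aligned} \tag{A.16}
$$

$$
\begin{aligned}
P_{q_3 q_3} = \frac{K_g^2 J^2 \sigma_f^2}{e^{r/\lambda} r^9 (y^2 + z^2)} \Big(
& 12\lambda^5 r^4 y^2 - 12 e^{r/\lambda}\lambda^5 r^4 y^2 + 12\lambda^4 r^5 y^2 + 5\lambda^3 r^6 y^2 \\
& + e^{r/\lambda}\lambda^3 r^6 y^2 + \lambda^2 r^7 y^2 - 60\lambda^5 r^2 x^2 y^2 + 60 e^{r/\lambda}\lambda^5 r^2 x^2 y^2 - 60\lambda^4 r^3 x^2 y^2 \\
& - 27\lambda^3 r^4 x^2 y^2 - 3 e^{r/\lambda}\lambda^3 r^4 x^2 y^2 - 7\lambda^2 r^5 x^2 y^2 - \lambda r^6 x^2 y^2 - 48\lambda^5 r^4 z^2 \\
& + 48 e^{r/\lambda}\lambda^5 r^4 z^2 - 48\lambda^4 r^5 z^2 - 22\lambda^3 r^6 z^2 - 2 e^{r/\lambda}\lambda^3 r^6 z^2 - 6\lambda^2 r^7 z^2 - \lambda r^8 z^2 \\
& + 420\lambda^5 r^2 x^2 z^2 - 420 e^{r/\lambda}\lambda^5 r^2 x^2 z^2 + 420\lambda^4 r^3 x^2 z^2 + 195\lambda^3 r^4 x^2 z^2 \\
& + 15 e^{r/\lambda}\lambda^3 r^4 x^2 z^2 + 55\lambda^2 r^5 x^2 z^2 + 10\lambda r^6 x^2 z^2 + r^7 x^2 z^2 - 420\lambda^5 x^4 z^2 \\
& + 420 e^{r/\lambda}\lambda^5 x^4 z^2 - 420\lambda^4 r x^4 z^2 - 195\lambda^3 r^2 x^4 z^2 - 15 e^{r/\lambda}\lambda^3 r^2 x^4 z^2 \\
& - 55\lambda^2 r^3 x^4 z^2 - 10\lambda r^4 x^4 z^2 - r^5 x^4 z^2 \Big)
\end{aligned} \tag{A.17}
$$

When $r \to 0$, the limit is

$$P_{q_3 q_3}(0,0,0) = \frac{1}{15}\,\sigma_f^2 K_g^2 J^2 \tag{A.18}$$

and when $y \to 0$, $z \to 0$, the limit is

$$
\begin{aligned}
P_{q_3 q_3}(x,0,0) = \frac{K_g^2 J^2 \sigma_f^2 \lambda}{e^{r/\lambda} r^6} \Big(
& -48\lambda^3 x^2 - 6\lambda x^4 - 48\lambda^4 r + 48 e^{r/\lambda}\lambda^4 r \\
& - 22\lambda^2 x^2 r - 2 e^{r/\lambda}\lambda^2 x^2 r - x^4 r \Big)
\end{aligned} \tag{A.19}
$$

$$
\begin{aligned}
P_{q_2 q_3} = \frac{K_g^2 J^2 \sigma_f^2\, y z}{e^{r/\lambda} r^9} \Big(
& -60\lambda^5 r^2 + 60 e^{r/\lambda}\lambda^5 r^2 - 60\lambda^4 r^3 - 27\lambda^3 r^4 - 3 e^{r/\lambda}\lambda^3 r^4 \\
& - 7\lambda^2 r^5 - \lambda r^6 + 420\lambda^5 x^2 - 420 e^{r/\lambda}\lambda^5 x^2 + 420\lambda^4 r x^2 + 195\lambda^3 r^2 x^2 \\
& + 15 e^{r/\lambda}\lambda^3 r^2 x^2 + 55\lambda^2 r^3 x^2 + 10\lambda r^4 x^2 + r^5 x^2 \Big)
\end{aligned} \tag{A.20}
$$

When $r \to 0$, the limit is

$$P_{q_2 q_3}(0,0,0) = 0 \tag{A.21}$$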
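
The zero-separation limits (A.9), (A.15), and (A.18) can be cross-checked without the closed-form expressions. Because $S_{ff}$ is isotropic, the flux variance implied by (A.2) separates into a radial integral of $S_{ff}$ and an angular average of the projection factor. The sketch below evaluates both by simple quadrature; the helper `midpoint` and the variable names are illustrative assumptions and not part of the original derivation.

```python
import numpy as np

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for a one-dimensional integral."""
    h = (b - a) / n
    t = a + h * (np.arange(n) + 0.5)
    return h * np.sum(f(t))

# Radial part: with t = lambda*k, the integral of S_ff (A.4) over wave-number space
# equals sigma_f^2 * (4/pi) * int_0^inf t^2/(1+t^2)^2 dt. Substituting t = tan(phi)
# turns the integrand into sin^2(phi) on [0, pi/2].
radial = (4.0 / np.pi) * midpoint(lambda p: np.sin(p) ** 2, 0.0, np.pi / 2)

# Angular averages over the unit sphere, with k1/k = cos(a):
# longitudinal factor <(1 - k1^2/k^2)^2> = (1/2) * int sin^5(a) da
long_factor = 0.5 * midpoint(lambda a: np.sin(a) ** 5, 0.0, np.pi)
# transverse factor <(k1*k2/k^2)^2> = (1/4) * int cos^2(a) sin^3(a) da
trans_factor = 0.25 * midpoint(lambda a: np.cos(a) ** 2 * np.sin(a) ** 3, 0.0, np.pi)

# Variances in units of sigma_f^2 * Kg^2 * J^2
var_q1 = radial * long_factor
var_q2 = radial * trans_factor
```

The two factors reproduce the coefficients 8/15 and 1/15 appearing in the longitudinal and transverse flux variances.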

The statistical correlation between the random fields of the logarithm of NAPL
residual content $\theta_n$ and the logarithm of hydraulic conductivity $K$ has
been proposed in equation (2.58), i.e.,

$$\ln\theta_n = \bar{y} + \delta y = \bar{y} + C\left(\rho f + \sqrt{1 - \rho^2}\; v\right) \tag{A.22}$$

Using this equation together with Darcy's equation, the covariances between
Darcy flux and $\ln\theta_n$ can be derived similarly, and are listed below.

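$$
\begin{aligned}
P_{q_1 y} = \frac{K_g J \rho C \sigma_f^2}{e^{r/\lambda} r^5} \Big(
& r^5 - 4\lambda^3 x^2 + 4 e^{r/\lambda}\lambda^3 x^2 - 4\lambda^2 r x^2 \\
& - 2\lambda r^2 x^2 - r^3 x^2 + 2\lambda^3 y^2 - 2 e^{r/\lambda}\lambda^3 y^2 \\
& + 2\lambda^2 r y^2 + \lambda r^2 y^2 + 2\lambda^3 z^2 - 2 e^{r/\lambda}\lambda^3 z^2 \\
& + 2\lambda^2 r z^2 + \lambda r^2 z^2 \Big)
\end{aligned} \tag{A.23}
$$

When $r \to 0$, the limit is

$$P_{q_1 y}(0,0,0) = \frac{2}{3}\,\sigma_f^2 J K_g \rho C \tag{A.24}$$

$$P_{q_2 y} = \frac{K_g J \rho C \sigma_f^2\, x y}{e^{r/\lambda} r^5} \left(-6\lambda^3 + 6 e^{r/\lambda}\lambda^3 - 6\lambda^2 r - 3\lambda r^2 - r^3\right) \tag{A.25}$$

When $r \to 0$, the limit is

$$P_{q_2 y}(0,0,0) = 0 \tag{A.26}$$

$$P_{q_3 y} = \frac{K_g J \rho C \sigma_f^2\, x z}{e^{r/\lambda} r^5} \left(-6\lambda^3 + 6 e^{r/\lambda}\lambda^3 - 6\lambda^2 r - 3\lambda r^2 - r^3\right) \tag{A.27}$$

When $r \to 0$, the limit is

$$P_{q_3 y}(0,0,0) = 0 \tag{A.28}$$

and

$$P_{yy} = C^2 \sigma_f^2\, e^{-r/\lambda} \tag{A.29}$$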
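
As a numerical sanity check, the flux-NAPL covariances can be evaluated at a small but nonzero separation and compared with the stated limits. The sketch below transcribes (A.23), (A.25), (A.27), and (A.29) as they are read here; the function name, argument names, and test values are assumptions made for illustration only.

```python
import numpy as np

def flux_lnapl_cov(x, y, z, lam, sigma_f2, Kg, J, rho, C):
    """Evaluate P_q1y, P_q2y, P_q3y, and P_yy at a nonzero separation (x, y, z)."""
    r = np.sqrt(x * x + y * y + z * z)
    e = np.exp(r / lam)
    pref = Kg * J * rho * C * sigma_f2 / (e * r**5)
    # (A.23): longitudinal flux versus log NAPL content
    p1 = pref * (r**5 - 4*lam**3*x**2 + 4*e*lam**3*x**2 - 4*lam**2*r*x**2
                 - 2*lam*r**2*x**2 - r**3*x**2
                 + (2*lam**3 - 2*e*lam**3 + 2*lam**2*r + lam*r**2) * (y**2 + z**2))
    # (A.25) and (A.27) share the same bracketed polynomial
    g = -6*lam**3 + 6*e*lam**3 - 6*lam**2*r - 3*lam*r**2 - r**3
    p2 = pref * x * y * g
    p3 = pref * x * z * g
    # (A.29): auto-covariance of log NAPL content
    pyy = C**2 * sigma_f2 * np.exp(-r / lam)
    return p1, p2, p3, pyy

# Separation much smaller than the correlation length; all constants set to one.
p1, p2, p3, pyy = flux_lnapl_cov(0.006, 0.005, 0.004, lam=1.0, sigma_f2=1.0,
                                 Kg=1.0, J=1.0, rho=1.0, C=1.0)
# p1 should be close to 2/3 as in (A.24); p2 and p3 should be close to zero.
```

Very small separations should be avoided in such a check, because the individual polynomial terms nearly cancel and round-off eventually dominates.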


APPENDIX  B 
THE  SQUARE-ROOT  DECOMPOSITION  METHOD 

The  covariance  equations  for  Pcc,  PqiC,  i=l,2,3,  and  Pyc  derived  in  the  last  section 
generally  account  for  two  different  location  vectors  denoted  by  x  and  x'.  A  three- 
dimensional  finite  difference  method  is  developed  to  solve  the  system  of  equations. 
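The domain is hence discretized into a number of blocks. The state vector
$\mathbf{Z}$ is defined as

$$\mathbf{Z} = \begin{pmatrix} \mathbf{c}(\mathbf{x},t) \\ \mathbf{q}_1(\mathbf{x}) \\ \mathbf{q}_2(\mathbf{x}) \\ \mathbf{q}_3(\mathbf{x}) \\ \mathbf{y}(\mathbf{x}) \end{pmatrix} \tag{B.1}$$

The composite state covariance matrix is thus expressed as

$$\mathbf{P}_{zz}(t) = \overline{\delta\mathbf{Z}\,\delta\mathbf{Z}^T} =
\begin{pmatrix}
P_{cc} & P_{cq_1} & P_{cq_2} & P_{cq_3} & P_{cy} \\
P_{q_1 c} & P_{q_1 q_1} & P_{q_1 q_2} & P_{q_1 q_3} & P_{q_1 y} \\
P_{q_2 c} & P_{q_2 q_1} & P_{q_2 q_2} & P_{q_2 q_3} & P_{q_2 y} \\
P_{q_3 c} & P_{q_3 q_1} & P_{q_3 q_2} & P_{q_3 q_3} & P_{q_3 y} \\
P_{yc} & P_{yq_1} & P_{yq_2} & P_{yq_3} & P_{yy}
\end{pmatrix} \tag{B.2}$$

where superscript "T" denotes matrix transpose, e.g.,
$\mathbf{Z}^T = [\mathbf{c}^T, \mathbf{q}_1^T, \mathbf{q}_2^T, \mathbf{q}_3^T, \mathbf{y}^T]$.
Obviously, the covariance matrix is symmetric and positive-definite.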

The covariance equations can now be simplified with matrix notation. The
covariance equation for $P_{q_i c}$, $i = 1, 2, 3$, in matrix form, becomes

$$U_m \frac{\partial}{\partial t}\left[P_{q_i c}\right]_{nm} = A_m \left[P_{q_i c}\right]_{nm} + B_m \left[P_{q_i q_j}\right]_{nm} + E_m \left[P_{q_i y}\right]_{nm} \tag{B.3}$$

where subscripts $m$ and $n$ designate the nodal location vectors $\mathbf{x}$
and $\mathbf{x}'$, respectively; $m, n = 1, \ldots, N$, where $N$ is the total
number of blocks. Note that Einstein's summation convention is implied in the
matrix operation. The operator vectors $U$, $A$, $B$, and $E$ are defined as
follows:

Um  =  Ow(^m)+KNe^  (B.4) 

d 


Am  =  -— [gj(xm)-]  +  — 


dx 


ink 


(B.5) 


Bm  =  -^-  [c(xm,*)-]  (B.6) 
Em  =  -^[-c(xm,t)}  (B.7) 

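Neglecting subscripts $m$ and $n$, equation (B.3) is further reduced to

$$\frac{\partial}{\partial t} P_{q_i c} = F_c \cdot P_{q_i c} + F_{q_j} \cdot P_{q_i q_j} + F_y \cdot P_{q_i y} \tag{B.8}$$

where

$$F_c = U^{-1} A \tag{B.9}$$

$$F_{q_j} = U^{-1} B \tag{B.10}$$

$$F_y = U^{-1} E \tag{B.11}$$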

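Similarly, the matrix form of the covariance equation for
$P_{yc}(\mathbf{x}', \mathbf{x}, t)$ can be written as follows:

$$U_m \frac{\partial}{\partial t}\left[P_{yc}\right]_{nm} = A_m \left[P_{yc}\right]_{nm} + B_m \left[P_{y q_i}\right]_{nm} + E_m \left[P_{yy}\right]_{nm} \tag{B.12}$$

or

$$\frac{\partial}{\partial t} P_{yc} = F_c \cdot P_{yc} + F_{q_i} \cdot P_{y q_i} + F_y \cdot P_{yy} \tag{B.13}$$

The concentration auto-covariance equation for
$P_{cc}(\mathbf{x}', \mathbf{x}, t)$ is written as

$$
\begin{aligned}
U_m \frac{\partial}{\partial t}\left[P_{cc}\right]_{nm} ={}& A_m \left[P_{cc}\right]_{nm} + B_m \left[P_{c q_i}\right]_{nm} + E_m \left[P_{cy}\right]_{nm} \\
& + \left[P_{cc}\right]_{mn} A_n + \left[P_{c q_i}\right]_{mn} B_n + \left[P_{cy}\right]_{mn} E_n
\end{aligned} \tag{B.14}
$$

or

$$
\begin{aligned}
\frac{\partial}{\partial t} P_{cc} ={}& F_c \cdot P_{cc} + F_{q_i} \cdot P_{c q_i} + F_y \cdot P_{cy} \\
& + P_{cc} \cdot F_c^T + P_{c q_i} \cdot F_{q_i}^T + P_{cy} \cdot F_y^T
\end{aligned} \tag{B.15}
$$

Equations (B.8), (B.13), and (B.15) have the form of the traditional
time-varying dynamic system model [82]. These equations can be combined into a
composite form. Inspecting the concentration perturbation equation, i.e.,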
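$$\left(\theta_w + K_N e^{\bar{y}}\right) \frac{\partial\, \delta c(\mathbf{x},t)}{\partial t} = -\frac{\partial}{\partial x_i}\left[\bar{q}_i(\mathbf{x})\, \delta c(\mathbf{x},t)\right] + \frac{\partial}{\partial x_i}\left[\theta_w D_{ij} \frac{\partial}{\partial x_j} \delta c(\mathbf{x},t)\right], \qquad \mathbf{x} \in D \tag{B.16}$$

A composite operation matrix $\mathbf{F}$ is defined as follows:

$$\mathbf{F} = \begin{pmatrix} F_c & F_{q_i} & F_y \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{B.17}$$

Hence the state propagation equations have the form of

$$\dot{\mathbf{P}} = \mathbf{F}\mathbf{P} + \mathbf{P}\mathbf{F}^T \tag{B.18}$$

where the dot denotes the time derivative.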

It can be shown [82] that for an $n \times n$, symmetric, positive semidefinite
matrix $\Gamma$, there exists at least one $n \times n$ "square root" matrix,
denoted as $\sqrt{\Gamma}$, such that

$$\sqrt{\Gamma}\,\sqrt{\Gamma}^{\,T} = \Gamma \tag{B.19}$$

In fact there are many matrices $\sqrt{\Gamma}$ which satisfy equation (B.19).
The essential idea of the square root decomposition method is to replace the
equation for the covariance $\mathbf{P}$ with an equation for its square root
$\sqrt{\mathbf{P}}$ (later denoted as $\mathbf{A}$) in order to save computer
time and storage. The covariance square roots are not uniquely defined, and
square root filters can be formulated in terms of general matrix square roots.
One means of exploiting this fact is to develop algorithms which maintain a
particularly attractive square root form, namely an upper or lower triangular
matrix (with all zeros below or above the main diagonal, respectively), thereby
requiring computation and storage of only $n(n+1)/2$ instead of $n^2$ scalar
variables.

The lack of uniqueness does not cause any difficulties in converting from a
problem description in terms of the initial $\mathbf{P}_0$ and its time history.
The reason is that any positive semidefinite matrix can be factored into the
product of a lower triangular matrix and its transpose by the Cholesky
decomposition algorithm [46]. Although equation (B.19) does not uniquely define
$\sqrt{\Gamma}$, a unique Cholesky lower triangular square root $\sqrt{\Gamma}$
(for a positive definite matrix, with positive diagonal elements) can be defined
such that $\sqrt{\Gamma}\,\sqrt{\Gamma}^{\,T} = \Gamma$.
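
The non-uniqueness of the matrix square root, and the special role of the Cholesky factor, can be seen in a small numerical example. The sketch below uses NumPy; the test matrix and the orthogonal matrix `Q` are arbitrary assumptions. Any square root multiplied on the right by an orthogonal matrix is again a square root, but only the Cholesky factor is lower triangular.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
gamma = M @ M.T + 4.0 * np.eye(4)              # symmetric positive-definite test matrix

L = np.linalg.cholesky(gamma)                  # lower-triangular square root
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # an arbitrary orthogonal matrix
other_root = L @ Q                             # a different, full square root

# Both factors reproduce gamma, since (L Q)(L Q)^T = L Q Q^T L^T = L L^T.
err_cholesky = np.max(np.abs(L @ L.T - gamma))
err_other = np.max(np.abs(other_root @ other_root.T - gamma))
```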

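Consider the state covariance matrix $\mathbf{P}_{zz}$; its square-root
decomposition is expressed as follows:

$$\mathbf{P}_{zz}(t) = \mathbf{A}_{zz}(t)\, \mathbf{A}_{zz}^T(t) \tag{B.20}$$

where $\mathbf{A}_{zz}$ corresponding to the initial condition is a lower
triangular matrix. The unknown matrix $\mathbf{A}_{zz}$ in general is defined as

$$\mathbf{A}_{zz}(t) =
\begin{pmatrix}
A_{cc} & A_{cq_1} & A_{cq_2} & A_{cq_3} & A_{cy} \\
A_{q_1 c} & A_{q_1 q_1} & A_{q_1 q_2} & A_{q_1 q_3} & A_{q_1 y} \\
A_{q_2 c} & A_{q_2 q_1} & A_{q_2 q_2} & A_{q_2 q_3} & A_{q_2 y} \\
A_{q_3 c} & A_{q_3 q_1} & A_{q_3 q_2} & A_{q_3 q_3} & A_{q_3 y} \\
A_{yc} & A_{yq_1} & A_{yq_2} & A_{yq_3} & A_{yy}
\end{pmatrix} \tag{B.21}$$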

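For simplicity, denote $\theta^T = (q_1, q_2, q_3)$ and rewrite equation (B.20)
more explicitly as

$$\mathbf{P}_{zz}(t) =
\begin{pmatrix}
A_{cc} & A_{c\theta} & A_{cy} \\
A_{\theta c} & A_{\theta\theta} & A_{\theta y} \\
A_{yc} & A_{y\theta} & A_{yy}
\end{pmatrix}
\begin{pmatrix}
A_{cc}^T & A_{\theta c}^T & A_{yc}^T \\
A_{c\theta}^T & A_{\theta\theta}^T & A_{y\theta}^T \\
A_{cy}^T & A_{\theta y}^T & A_{yy}^T
\end{pmatrix} \tag{B.22}$$

It follows that

$$P_{cc}(t) = A_{cc} A_{cc}^T + A_{c\theta} A_{c\theta}^T + A_{cy} A_{cy}^T \tag{B.23}$$

$$P_{c\theta}(t) = A_{cc} A_{\theta c}^T + A_{c\theta} A_{\theta\theta}^T + A_{cy} A_{\theta y}^T \tag{B.24}$$

$$P_{cy}(t) = A_{cc} A_{yc}^T + A_{c\theta} A_{y\theta}^T + A_{cy} A_{yy}^T \tag{B.25}$$

$$P_{\theta\theta}(t) = A_{\theta c} A_{\theta c}^T + A_{\theta\theta} A_{\theta\theta}^T + A_{\theta y} A_{\theta y}^T \tag{B.26}$$

$$P_{\theta y}(t) = A_{\theta c} A_{yc}^T + A_{\theta\theta} A_{y\theta}^T + A_{\theta y} A_{yy}^T \tag{B.27}$$

$$P_{yy}(t) = A_{yc} A_{yc}^T + A_{y\theta} A_{y\theta}^T + A_{yy} A_{yy}^T \tag{B.28}$$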

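It can be shown (see Appendix) that for the state propagation equation (B.18),
if $\mathbf{P}$ is symmetric, positive semidefinite and can be decomposed into
the product of its square roots, i.e., $\mathbf{P} = \mathbf{A}\mathbf{A}^T$,
then solving equation (B.18) is equivalent to solving the following equation:

$$\dot{\mathbf{A}} = \mathbf{F}\mathbf{A} \tag{B.29}$$

Indeed, if $\mathbf{A}$ satisfies (B.29), the product rule gives
$\dot{\mathbf{P}} = \dot{\mathbf{A}}\mathbf{A}^T + \mathbf{A}\dot{\mathbf{A}}^T = \mathbf{F}\mathbf{P} + \mathbf{P}\mathbf{F}^T$.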

Thus, instead of solving the original covariance equations, the much simpler
equation (B.29), which is written in terms of the square root of the covariance
matrix, can be solved. Equation (B.29) can be written more explicitly as

$$\frac{\partial}{\partial t}
\begin{pmatrix}
A_{cc} & A_{c\theta} & A_{cy} \\
A_{\theta c} & A_{\theta\theta} & A_{\theta y} \\
A_{yc} & A_{y\theta} & A_{yy}
\end{pmatrix}
=
\begin{pmatrix}
F_c & F_{q_i} & F_y \\
0 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
\begin{pmatrix}
A_{cc} & A_{c\theta} & A_{cy} \\
A_{\theta c} & A_{\theta\theta} & A_{\theta y} \\
A_{yc} & A_{y\theta} & A_{yy}
\end{pmatrix} \tag{B.30}$$

It follows that

$$\dot{A}_{cc} = \frac{\partial}{\partial t} A_{cc} = F_c A_{cc} + F_{q_i} A_{\theta c} + F_y A_{yc} \tag{B.31}$$

$$\frac{\partial}{\partial t} A_{c\theta} = F_c A_{c\theta} + F_{q_i} A_{\theta\theta} + F_y A_{y\theta} \tag{B.32}$$

$$\frac{\partial}{\partial t} A_{cy} = F_c A_{cy} + F_{q_i} A_{\theta y} + F_y A_{yy} \tag{B.33}$$

$$\frac{\partial}{\partial t} A_{\theta c} = 0 \tag{B.34}$$

$$\frac{\partial}{\partial t} A_{\theta\theta} = 0 \tag{B.35}$$

$$\frac{\partial}{\partial t} A_{\theta y} = 0 \tag{B.36}$$

$$\frac{\partial}{\partial t} A_{yc} = 0 \tag{B.37}$$

$$\frac{\partial}{\partial t} A_{y\theta} = 0 \tag{B.38}$$

$$\frac{\partial}{\partial t} A_{yy} = 0 \tag{B.39}$$

Equations (B.34), (B.35), (B.36), (B.37), (B.38), and (B.39) simply state that
some sub-matrices of the square root matrix $\mathbf{A}$ remain invariant with
respect to time and maintain their initial values. If the initial concentration
($t = t_0$) is perfectly known, then the covariances involving the concentration
perturbation become zero, i.e., $P_{cc}$, $P_{\theta c}$, $P_{yc}$,
$P_{c\theta}$, and $P_{cy} = 0$ when $t = t_0$. This leads to

$$A_{cc} = 0, \qquad t = t_0 \tag{B.40}$$

$$A_{\theta c} = 0, \qquad t = \text{any time} \tag{B.41}$$

$$A_{yc} = 0, \qquad t = \text{any time} \tag{B.42}$$

Substituting equation (B.41) and equation (B.42) into equation (B.31), we have

$$\dot{A}_{cc} = \frac{\partial}{\partial t} A_{cc} = 0 \tag{B.43}$$

which means that $A_{cc}$ also remains time-invariant.

The above analysis indicates that only two (in fact four, since
$\theta^T = (q_1, q_2, q_3)$) sub-matrices of $\mathbf{A}$, namely $A_{c\theta}$
and $A_{cy}$, propagate with time while the rest remain time-invariant.
Therefore, to solve equation (B.29), only equations (B.32) and (B.33) need to be
solved. Thus the square root decomposition algorithm greatly reduces the total
number of equations to be solved. The covariance matrix $\mathbf{P}$ can be
reconstructed by applying equations (B.23) through (B.28) at any desired time.
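
The relation between (B.18) and (B.29), and the time-invariance of the flux and NAPL block rows, can be checked numerically for a small composite system. In the sketch below the block sizes, the random operator entries, and the variable names are all hypothetical; only the zero-block structure of $\mathbf{F}$ follows (B.17).

```python
import numpy as np

rng = np.random.default_rng(2)
nc, nq, ny = 3, 4, 2                       # hypothetical block sizes (c, theta, y)
n = nc + nq + ny

# Composite operator: only the concentration block row is nonzero, as in (B.17).
F = np.zeros((n, n))
F[:nc, :] = rng.normal(size=(nc, n))

A = np.tril(rng.normal(size=(n, n)))       # an arbitrary lower-triangular square root
P = A @ A.T                                # corresponding covariance

A_dot = F @ A                              # square-root propagation, (B.29)
P_dot_from_A = A_dot @ A.T + A @ A_dot.T   # product rule applied to P = A A^T
P_dot_direct = F @ P + P @ F.T             # covariance propagation, (B.18)

# Rows of A belonging to the flux and NAPL blocks have zero time derivative.
static_rows = A_dot[nc:, :]
```

The check only verifies the instantaneous time derivatives; it does not replace a time-stepping scheme for (B.32) and (B.33).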

Before solving the moment propagation equations, some elements of the
composite state covariance matrix $\mathbf{P}_{zz}$ must be computed. These
covariances include

$$\begin{pmatrix} P_{q_i q_j} & P_{q_i y} \\ P_{y q_j} & P_{yy} \end{pmatrix} \tag{B.44}$$

As discussed earlier, these covariances remain temporally unchanged in the
unconditional moment propagation. Since the above matrix is symmetric and
positive-definite, its square root can be found via the Cholesky decomposition
algorithm [95] (see Appendix D). Cholesky decomposition constructs a lower
triangular matrix whose transpose can itself serve as the upper triangular part.
The apparent advantage is that data storage for the non-zero elements occupying
the lower triangular portion of the square-root matrix is reduced almost by half
compared to that for a square matrix. Instead of $N^2$ elements, only
$N(N+1)/2$ elements are stored for the triangular matrix.
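
The storage saving can be made concrete with a packed representation of the triangular factor. The following sketch uses NumPy's `tril_indices` to store only the lower triangle; the helper names `pack_lower` and `unpack_lower` and the test matrix are assumptions for illustration, and the storage layout actually used in this work is not specified here.

```python
import numpy as np

def pack_lower(L):
    """Return the lower-triangular entries of L as a 1-D array of length N(N+1)/2."""
    rows, cols = np.tril_indices(L.shape[0])
    return L[rows, cols]

def unpack_lower(packed, n):
    """Rebuild the full n x n lower-triangular matrix from its packed form."""
    L = np.zeros((n, n))
    rows, cols = np.tril_indices(n)
    L[rows, cols] = packed
    return L

n = 5
rng = np.random.default_rng(3)
M = rng.normal(size=(n, n))
P = M @ M.T + n * np.eye(n)                # symmetric positive-definite covariance
L = np.linalg.cholesky(P)                  # lower-triangular square root
packed = pack_lower(L)                     # N(N+1)/2 stored values instead of N^2
P_rebuilt = unpack_lower(packed, n) @ unpack_lower(packed, n).T
```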


APPENDIX  C 
DEFINITION  OF  PARAMETERS  IN  TABLES  3.3-3.6 


Table C.1 defines the statistical parameters appearing in Tables 3.3 through 3.6.


Table C.1: Definition of the prior and posterior statistics used in Table 3.3
through Table 3.6.


Parameter 

Definition 

Prior statistics
Unconditional 

Mean 

y  =  -4.619 

oy  =  0.864 

Residual 

Mean 

lnfl*„(xi)  -  j/ 

Std. 

a<t>  -  y  E?=i 

-1 2 

ln6>„(x,)  -  y-<f> 

Normalized 
residual 

Mean 

r         |  y-vji  \n(ln(x,)-y 
n  i-^i—  1  av 

Std. 

^  =  V^t  E,n=i 

In  §n(xi)-y  ^ 

2 

Posterior 
Conditional 

Mean 

A  

y  =  £Er=iln^«(xi) 

Oy 

av  =  y^i  ELi 

-      -                  i  2 

ln0n(xi)  -  £ 

Residual 

Mean 

ln6»n(xi)  -  ln0„(xj) 

Std. 

a.                -  "I  2 

ln#n(xt)  -  ln6»„(xi)  -  4> 

Normalized 
residual 

Mean 

^  _  1  v^n  In0„(xj)-ln0„(xj) 

Std. 

r    -           -            «  1 2 

In0n(xi)-ln0„(xi)  ^


APPENDIX  D
CHOLESKY  DECOMPOSITION


If a square matrix $\mathbf{A}$ happens to be symmetric and positive definite,
then it has a special, efficient, triangular decomposition. Symmetric means that
$a_{ij} = a_{ji}$ for $i, j = 1, \ldots, N$, while positive definite means that

$$\mathbf{v} \cdot \mathbf{A} \cdot \mathbf{v} > 0 \quad \text{for all nonzero vectors } \mathbf{v} \tag{D.1}$$

Positive definite has the equivalent interpretation that $\mathbf{A}$ has all
positive eigenvalues.

Instead of seeking arbitrary lower and upper triangular factors $\mathbf{L}$
and $\mathbf{U}$, Cholesky decomposition constructs a lower triangular matrix
$\mathbf{L}$ whose transpose $\mathbf{L}^T$ can itself serve as the upper
triangular part, i.e.,

$$\mathbf{L} \cdot \mathbf{L}^T = \mathbf{A} \tag{D.2}$$

This factorization is sometimes referred to as "taking the square root" of the
matrix $\mathbf{A}$. The elements of $\mathbf{L}$ are defined through the
following recursive expressions [95]:

$$L_{ii} = \left(a_{ii} - \sum_{k=1}^{i-1} L_{ik}^2\right)^{1/2} \tag{D.3}$$

and

$$L_{ji} = \frac{1}{L_{ii}}\left(a_{ij} - \sum_{k=1}^{i-1} L_{ik} L_{jk}\right), \qquad j = i+1, i+2, \ldots, N \tag{D.4}$$
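
The recursion in (D.3) and (D.4) translates directly into a short routine. The sketch below is a plain Python/NumPy transcription for illustration, not the code used in this work; the function name `cholesky_lower` and the example matrix are assumptions, and indices are zero-based rather than one-based.

```python
import numpy as np

def cholesky_lower(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        # (D.3): diagonal element from the already computed part of row i
        d = a[i, i] - np.dot(L[i, :i], L[i, :i])
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[i, i] = np.sqrt(d)
        # (D.4): elements below the diagonal in column i; only a[i, j] with j > i is read
        for j in range(i + 1, n):
            L[j, i] = (a[i, j] - np.dot(L[i, :i], L[j, :i])) / L[i, i]
    return L

A = np.array([[4.0, 2.0, 0.6],
              [2.0, 5.0, 1.0],
              [0.6, 1.0, 3.0]])
L = cholesky_lower(A)                      # satisfies L @ L.T == A up to round-off
```

A failed square root in (D.3) is a convenient test of positive definiteness, which is why the routine raises an error in that case.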

If equations (D.3) and (D.4) are applied in the order $i = 1, 2, \ldots, N$,
then the elements of $\mathbf{L}$ that occur on the right-hand side of equations
(D.3) and (D.4) are already determined by the time they are needed to evaluate
the elements of $\mathbf{L}$ that occur on the left-hand side. Also, only
components $a_{ij}$ with $j \geq i$ are referenced.


APPENDIX  E

FORMULATION  OF  THE  ITERATIVE  LINE-SOR  SCHEME


Development of the iterative Line-SOR finite-difference scheme is demonstrated
with the ensemble concentration mean equations (equations (2.20)-(2.22)). These
equations are rewritten as follows:


/i(x) 


<9c(x,  t) 
dt 


where 


(KNe")-[Ptc(x,x,t)]  -  jj-foWefot)] 


dxi 


xgD  (E.l) 


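$$\mu(\mathbf{x}) = \theta_w(\mathbf{x}) + K_N e^{\bar{y}(\mathbf{x})} \tag{E.2}$$

with initial condition

$$\bar{c}(\mathbf{x}, t) = c_0, \qquad \mathbf{x} \in D, \quad t = t_0 \tag{E.3}$$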


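The boundary conditions are summarized as follows. At the inflow boundary,

$$\bar{q}_x(\mathbf{x})\, \bar{c}(\mathbf{x}, t) + P_{q_x c}(\mathbf{x}, \mathbf{x}, t) - \theta_w D_x \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial x} = \bar{q}_x(\mathbf{x})\, C_b, \qquad \mathbf{x} \in \mathbf{x}(0, y, z) \tag{E.4}$$

At the outflow boundary,

$$\frac{\partial \bar{c}(\mathbf{x}, t)}{\partial x} = 0, \qquad \mathbf{x} \in \mathbf{x}(l_x, y, z) \tag{E.5}$$

At the y-direction boundaries,

$$\frac{\partial \bar{c}(\mathbf{x}, t)}{\partial y} = 0, \qquad \mathbf{x} \in \mathbf{x}(x, 0, z) \ \text{or} \ (x, l_y, z) \tag{E.6}$$

At the z-direction boundaries,

$$\frac{\partial \bar{c}(\mathbf{x}, t)}{\partial z} = 0, \qquad \mathbf{x} \in \mathbf{x}(x, y, 0) \ \text{or} \ (x, y, l_z) \tag{E.7}$$

where $\mathbf{x}(x, y, z)$ represents the location vector in space, and $l_x$,
$l_y$, $l_z$ are the dimensions of the simulation domain in the x-, y-, and
z-directions.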

Equations (E.1) through (E.7) involve partial derivatives, which can be
represented by finite-difference approximations. The scheme is second-order,
central-differencing in space and first-order, backward-differencing in time. In
the following equations $\Delta x$, $\Delta y$, and $\Delta z$ are the spatial
grid sizes, and $\Delta t$ is the time step. The indices for spatial
discretization in the x-, y-, and z-directions are $i$, $j$, and $k$,
respectively, and the time index is $n$.

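(1) The time-derivative term in equation (E.1):

$$\mu(\mathbf{x}) \frac{\partial \bar{c}(\mathbf{x}, t)}{\partial t} = \frac{\mu_{i,j,k}\, \bar{c}^{\,n+1}_{i,j,k} - \mu_{i,j,k}\, \bar{c}^{\,n}_{i,j,k}}{\Delta t} \tag{E.8}$$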

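(2) The advection term in equation (E.1):

Recall that under steady-state flow conditions, the Darcy flux satisfies

$$
\frac{\partial q_i}{\partial x_i} = \frac{\partial q_x}{\partial x} + \frac{\partial q_y}{\partial y} + \frac{\partial q_z}{\partial z} = 0 \tag{E.9}
$$

The random field is expanded into the sum of a mean and a zero-mean perturbation,
i.e., $q_i = \bar{q}_i + \delta q_i$. Substituting this into equation (E.9) and taking the expectation
yields the divergence equation for the Darcy flux mean,

$$
\frac{\partial \bar{q}_i}{\partial x_i} = \frac{\partial \bar{q}_x}{\partial x} + \frac{\partial \bar{q}_y}{\partial y} + \frac{\partial \bar{q}_z}{\partial z} = 0 \tag{E.10}
$$

Subtracting equation (E.10) from equation (E.9) yields the perturbation equation for
the Darcy flux divergence,

$$
\frac{\partial (\delta q_i)}{\partial x_i} = \frac{\partial (\delta q_x)}{\partial x} + \frac{\partial (\delta q_y)}{\partial y} + \frac{\partial (\delta q_z)}{\partial z} = 0 \tag{E.11}
$$

The advection term in equation (E.1) can be written as

$$
\frac{\partial}{\partial x_i}\left[\bar{q}_i(\mathbf{x})\,\bar{c}(\mathbf{x},t)\right] = \bar{q}_i(\mathbf{x})\,\frac{\partial}{\partial x_i}\left[\bar{c}(\mathbf{x},t)\right] + \bar{c}(\mathbf{x},t)\,\frac{\partial}{\partial x_i}\left[\bar{q}_i(\mathbf{x})\right] \tag{E.12}
$$

Introduction of equation (E.10) into equation (E.12) yields

$$
\frac{\partial}{\partial x_i}\left[\bar{q}_i(\mathbf{x})\,\bar{c}(\mathbf{x},t)\right] = \bar{q}_i(\mathbf{x})\,\frac{\partial}{\partial x_i}\left[\bar{c}(\mathbf{x},t)\right] \tag{E.13}
$$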
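The finite-difference expansion for equation (E.13) can now be written as

$$
\bar{q}_x(\mathbf{x})\,\frac{\partial}{\partial x}\left[\bar{c}(\mathbf{x},t)\right] = \xi\,\frac{[\bar{q}_x]_{i,j,k}}{2\Delta x}\left(\bar{c}^{\,n+1}_{i+1,j,k} - \bar{c}^{\,n+1}_{i-1,j,k}\right) + (1-\xi)\,\frac{[\bar{q}_x]_{i,j,k}}{2\Delta x}\left(\bar{c}^{\,n}_{i+1,j,k} - \bar{c}^{\,n}_{i-1,j,k}\right) \tag{E.14}
$$

$$
\bar{q}_y(\mathbf{x})\,\frac{\partial}{\partial y}\left[\bar{c}(\mathbf{x},t)\right] = \xi\,\frac{[\bar{q}_y]_{i,j,k}}{2\Delta y}\left(\bar{c}^{\,n+1}_{i,j+1,k} - \bar{c}^{\,n+1}_{i,j-1,k}\right) + (1-\xi)\,\frac{[\bar{q}_y]_{i,j,k}}{2\Delta y}\left(\bar{c}^{\,n}_{i,j+1,k} - \bar{c}^{\,n}_{i,j-1,k}\right) \tag{E.15}
$$

$$
\bar{q}_z(\mathbf{x})\,\frac{\partial}{\partial z}\left[\bar{c}(\mathbf{x},t)\right] = \xi\,\frac{[\bar{q}_z]_{i,j,k}}{2\Delta z}\left(\bar{c}^{\,n+1}_{i,j,k+1} - \bar{c}^{\,n+1}_{i,j,k-1}\right) + (1-\xi)\,\frac{[\bar{q}_z]_{i,j,k}}{2\Delta z}\left(\bar{c}^{\,n}_{i,j,k+1} - \bar{c}^{\,n}_{i,j,k-1}\right) \tag{E.16}
$$

where $\xi$ is a scheme coefficient. When $\xi = 0.5$, the above differencing method is
called the Crank-Nicolson method; $\xi = 0$ implies an explicit scheme, while $\xi = 1.0$
implies a fully implicit scheme.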
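
The weighting in (E.14) can be illustrated with a minimal sketch along a single grid line in the x-direction. The function name `advection_x` and its arguments are hypothetical and do not come from the original code; the mean flux and concentrations are assumed to be stored as plain Python lists indexed by `i`.

```python
def advection_x(q_bar, c_new, c_old, i, dx, xi):
    """Weighted central difference of q_bar * dc/dx at interior node i.

    q_bar : mean Darcy flux at the nodes of one x-line (assumed layout)
    c_new : mean concentration at time level n+1
    c_old : mean concentration at time level n
    xi    : scheme coefficient (0 explicit, 0.5 Crank-Nicolson, 1 implicit)
    """
    implicit_part = q_bar[i] * (c_new[i + 1] - c_new[i - 1]) / (2.0 * dx)
    explicit_part = q_bar[i] * (c_old[i + 1] - c_old[i - 1]) / (2.0 * dx)
    return xi * implicit_part + (1.0 - xi) * explicit_part
```

For a concentration profile that is linear in x, the central difference is exact, so the result equals the flux times the gradient for any value of `xi` when both time levels hold the same profile.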

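(3) Local dispersive terms in equation (E.1)

The component in the x-direction is

$$
\frac{\partial}{\partial x}\left[\theta_w(\mathbf{x})\,D_x(\mathbf{x})\,\frac{\partial}{\partial x}\bar{c}(\mathbf{x},t)\right] = \xi\,\frac{E_x\,\bar{c}^{\,n+1}_{i-1,j,k} + B_x\,\bar{c}^{\,n+1}_{i,j,k} + A_x\,\bar{c}^{\,n+1}_{i+1,j,k}}{(\Delta x)^2} + (1-\xi)\,\frac{E_x\,\bar{c}^{\,n}_{i-1,j,k} + B_x\,\bar{c}^{\,n}_{i,j,k} + A_x\,\bar{c}^{\,n}_{i+1,j,k}}{(\Delta x)^2} \tag{E.17}
$$

where

$$
E_x = \frac{1}{2}\left([\theta_w]_{i-1,j,k}\,[D_x]_{i-1,j,k} + [\theta_w]_{i,j,k}\,[D_x]_{i,j,k}\right) \tag{E.18}
$$

$$
B_x = -\frac{1}{2}\left([\theta_w]_{i-1,j,k}\,[D_x]_{i-1,j,k} + 2\,[\theta_w]_{i,j,k}\,[D_x]_{i,j,k} + [\theta_w]_{i+1,j,k}\,[D_x]_{i+1,j,k}\right) \tag{E.19}
$$

$$
A_x = \frac{1}{2}\left([\theta_w]_{i,j,k}\,[D_x]_{i,j,k} + [\theta_w]_{i+1,j,k}\,[D_x]_{i+1,j,k}\right) \tag{E.20}
$$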
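
The three coefficients in (E.18) through (E.20) multiply the concentrations at nodes i-1, i, and i+1 and are built from arithmetic averages of the product of water content and dispersion coefficient at neighboring nodes. The following is a minimal sketch of that averaging along one x-line; the function name and the list-based storage are assumptions made for illustration only.

```python
def dispersion_coefficients(theta_w, d_x, i):
    """Return (lower, centre, upper) dispersion coefficients at interior node i.

    lower  multiplies the concentration at node i-1,
    centre multiplies the concentration at node i,
    upper  multiplies the concentration at node i+1.
    theta_w and d_x are lists of nodal values along one x-line (assumed layout).
    """
    # Nodal products theta_w * D_x
    td = [t * d for t, d in zip(theta_w, d_x)]
    lower = 0.5 * (td[i - 1] + td[i])
    upper = 0.5 * (td[i] + td[i + 1])
    centre = -0.5 * (td[i - 1] + 2.0 * td[i] + td[i + 1])
    return lower, centre, upper
```

By construction the centre coefficient equals minus the sum of the other two, so a spatially uniform concentration produces no dispersive flux.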
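and the component in the y-direction is

$$
\frac{\partial}{\partial y}\left[\theta_w(\mathbf{x})\,D_y(\mathbf{x})\,\frac{\partial}{\partial y}\bar{c}(\mathbf{x},t)\right] = \xi\,\frac{E_y\,\bar{c}^{\,n+1}_{i,j-1,k} + B_y\,\bar{c}^{\,n+1}_{i,j,k} + A_y\,\bar{c}^{\,n+1}_{i,j+1,k}}{(\Delta y)^2} + (1-\xi)\,\frac{E_y\,\bar{c}^{\,n}_{i,j-1,k} + B_y\,\bar{c}^{\,n}_{i,j,k} + A_y\,\bar{c}^{\,n}_{i,j+1,k}}{(\Delta y)^2} \tag{E.21}
$$

where

$$
E_y = \frac{1}{2}\left([\theta_w]_{i,j-1,k}\,[D_y]_{i,j-1,k} + [\theta_w]_{i,j,k}\,[D_y]_{i,j,k}\right) \tag{E.22}
$$

$$
B_y = -\frac{1}{2}\left([\theta_w]_{i,j-1,k}\,[D_y]_{i,j-1,k} + 2\,[\theta_w]_{i,j,k}\,[D_y]_{i,j,k} + [\theta_w]_{i,j+1,k}\,[D_y]_{i,j+1,k}\right) \tag{E.23}
$$

$$
A_y = \frac{1}{2}\left([\theta_w]_{i,j,k}\,[D_y]_{i,j,k} + [\theta_w]_{i,j+1,k}\,[D_y]_{i,j+1,k}\right) \tag{E.24}
$$

and the component in the z-direction is

$$
\frac{\partial}{\partial z}\left[\theta_w(\mathbf{x})\,D_z(\mathbf{x})\,\frac{\partial}{\partial z}\bar{c}(\mathbf{x},t)\right] = \xi\,\frac{E_z\,\bar{c}^{\,n+1}_{i,j,k-1} + B_z\,\bar{c}^{\,n+1}_{i,j,k} + A_z\,\bar{c}^{\,n+1}_{i,j,k+1}}{(\Delta z)^2} + (1-\xi)\,\frac{E_z\,\bar{c}^{\,n}_{i,j,k-1} + B_z\,\bar{c}^{\,n}_{i,j,k} + A_z\,\bar{c}^{\,n}_{i,j,k+1}}{(\Delta z)^2} \tag{E.25}
$$

where

$$
E_z = \frac{1}{2}\left([\theta_w]_{i,j,k-1}\,[D_z]_{i,j,k-1} + [\theta_w]_{i,j,k}\,[D_z]_{i,j,k}\right) \tag{E.26}
$$

$$
B_z = -\frac{1}{2}\left([\theta_w]_{i,j,k-1}\,[D_z]_{i,j,k-1} + 2\,[\theta_w]_{i,j,k}\,[D_z]_{i,j,k} + [\theta_w]_{i,j,k+1}\,[D_z]_{i,j,k+1}\right) \tag{E.27}
$$

$$
A_z = \frac{1}{2}\left([\theta_w]_{i,j,k}\,[D_z]_{i,j,k} + [\theta_w]_{i,j,k+1}\,[D_z]_{i,j,k+1}\right) \tag{E.28}
$$

(4) Forcing terms
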
The remaining two terms in the concentration mean equation (E.1) are forcing
terms through which equation (E.1) is coupled with the concentration covariance
equations. They are available only at the time step prior to the current time step,
and hence cannot be weighted with an implicit scheme or the Crank-Nicolson scheme.
However, this situation may be improved if an iterative looping procedure over the
entire system is adopted.

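The finite-difference equations for the forcing terms are written as

$$
\frac{\partial}{\partial x}\left[P_{q_x c}(\mathbf{x},\mathbf{x},t)\right] = \xi\,\frac{[P_{q_x c}]^{n+1}_{i+1,j,k|i+1,j,k} - [P_{q_x c}]^{n+1}_{i-1,j,k|i-1,j,k}}{2\Delta x} + (1-\xi)\,\frac{[P_{q_x c}]^{n}_{i+1,j,k|i+1,j,k} - [P_{q_x c}]^{n}_{i-1,j,k|i-1,j,k}}{2\Delta x} \tag{E.29}
$$

$$
\frac{\partial}{\partial y}\left[P_{q_y c}(\mathbf{x},\mathbf{x},t)\right] = \xi\,\frac{[P_{q_y c}]^{n+1}_{i,j+1,k|i,j+1,k} - [P_{q_y c}]^{n+1}_{i,j-1,k|i,j-1,k}}{2\Delta y} + (1-\xi)\,\frac{[P_{q_y c}]^{n}_{i,j+1,k|i,j+1,k} - [P_{q_y c}]^{n}_{i,j-1,k|i,j-1,k}}{2\Delta y} \tag{E.30}
$$

$$
\frac{\partial}{\partial z}\left[P_{q_z c}(\mathbf{x},\mathbf{x},t)\right] = \xi\,\frac{[P_{q_z c}]^{n+1}_{i,j,k+1|i,j,k+1} - [P_{q_z c}]^{n+1}_{i,j,k-1|i,j,k-1}}{2\Delta z} + (1-\xi)\,\frac{[P_{q_z c}]^{n}_{i,j,k+1|i,j,k+1} - [P_{q_z c}]^{n}_{i,j,k-1|i,j,k-1}}{2\Delta z} \tag{E.31}
$$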


The finite-difference equation for the time derivative of $P_{\psi c}(\mathbf{x},\mathbf{x},t)$ is

$$
\frac{\partial}{\partial t}\left[P_{\psi c}(\mathbf{x},\mathbf{x},t)\right] = \frac{1}{\Delta t}\left([P_{\psi c}]^{n+1}_{i,j,k|i,j,k} - [P_{\psi c}]^{n}_{i,j,k|i,j,k}\right) \tag{E.32}
$$

Note that equation (E.32) multiplied by $K_N e^{\bar{\psi}_{i,j,k}}$ forms the finite-difference equation
for the second forcing term in equation (E.1).
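To simplify the following discussion, the sum of the forcing terms approximated by equations
(E.29) through (E.32), with the signs they carry in equation (E.1), is denoted "FORCING". Combining equations (E.8) through
(E.32) yields

$$
\frac{\mu_{i,j,k}\left(\bar{c}^{\,n+1}_{i,j,k} - \bar{c}^{\,n}_{i,j,k}\right)}{\Delta t} = \xi\left(AAX \cdot \bar{c}^{\,n+1}_{i-1,j,k} + CCX \cdot \bar{c}^{\,n+1}_{i+1,j,k} + AAY \cdot \bar{c}^{\,n+1}_{i,j-1,k} + CCY \cdot \bar{c}^{\,n+1}_{i,j+1,k} + AAZ \cdot \bar{c}^{\,n+1}_{i,j,k-1} + CCZ \cdot \bar{c}^{\,n+1}_{i,j,k+1} + BB \cdot \bar{c}^{\,n+1}_{i,j,k}\right) + HIST + FORCING \tag{E.33}
$$

where

$$
AAX = \frac{[\bar{q}_x]_{i,j,k}}{2\Delta x} + \frac{E_x}{(\Delta x)^2} \tag{E.34}
$$

$$
CCX = -\frac{[\bar{q}_x]_{i,j,k}}{2\Delta x} + \frac{A_x}{(\Delta x)^2} \tag{E.35}
$$

$$
AAY = \frac{[\bar{q}_y]_{i,j,k}}{2\Delta y} + \frac{E_y}{(\Delta y)^2} \tag{E.36}
$$

$$
CCY = -\frac{[\bar{q}_y]_{i,j,k}}{2\Delta y} + \frac{A_y}{(\Delta y)^2} \tag{E.37}
$$

$$
AAZ = \frac{[\bar{q}_z]_{i,j,k}}{2\Delta z} + \frac{E_z}{(\Delta z)^2} \tag{E.38}
$$

$$
CCZ = -\frac{[\bar{q}_z]_{i,j,k}}{2\Delta z} + \frac{A_z}{(\Delta z)^2} \tag{E.39}
$$

$$
BB = \frac{B_x}{(\Delta x)^2} + \frac{B_y}{(\Delta y)^2} + \frac{B_z}{(\Delta z)^2} \tag{E.40}
$$

$$
HIST = (1-\xi)\left(AAX \cdot \bar{c}^{\,n}_{i-1,j,k} + CCX \cdot \bar{c}^{\,n}_{i+1,j,k} + AAY \cdot \bar{c}^{\,n}_{i,j-1,k} + CCY \cdot \bar{c}^{\,n}_{i,j+1,k} + AAZ \cdot \bar{c}^{\,n}_{i,j,k-1} + CCZ \cdot \bar{c}^{\,n}_{i,j,k+1} + BB \cdot \bar{c}^{\,n}_{i,j,k}\right) \tag{E.41}
$$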
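
As a rough check on the stencil coefficients, the sketch below assembles the neighbor coefficients for one direction and evaluates the explicit history term of (E.41) at a single node. It assumes that advection and dispersion sit on the right-hand side of the discretized equation, so the lower-neighbor coefficient carries a positive advective part and the upper-neighbor coefficient a negative one. The function names and the argument layout are hypothetical.

```python
def direction_coefficients(q_bar, lower, upper, h):
    """Neighbor coefficients for one direction at one node.

    q_bar : mean Darcy flux component at the node
    lower : dispersion coefficient for the neighbor at index - 1
    upper : dispersion coefficient for the neighbor at index + 1
    h     : grid spacing in this direction
    Returns (aa, cc), the coefficients of the lower and upper neighbors.
    """
    aa = q_bar / (2.0 * h) + lower / h ** 2
    cc = -q_bar / (2.0 * h) + upper / h ** 2
    return aa, cc


def hist_term(xi, neighbor_coeffs, bb, c_center, c_neighbors):
    """Explicit (time level n) part of the seven-point stencil.

    neighbor_coeffs and c_neighbors are ordered as
    (i-1, i+1, j-1, j+1, k-1, k+1); bb multiplies the centre node.
    """
    total = bb * c_center
    for coeff, value in zip(neighbor_coeffs, c_neighbors):
        total += coeff * value
    return (1.0 - xi) * total
```

Because each centre dispersion coefficient equals minus the sum of its two neighbors and the advective parts cancel in pairs, the seven coefficients sum to zero, so a spatially uniform concentration field gives a zero history term.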
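Rearranging equation (E.33) gives the line successive overrelaxation (LSOR) scheme in the x-direction:

$$
(\xi \cdot AAX) \cdot \bar{c}^{\,m+1}_{i-1,j,k} + \left(\xi \cdot BB - \frac{\mu_{i,j,k}}{\Delta t}\right) \cdot \bar{c}^{\,m+1}_{i,j,k} + (\xi \cdot CCX) \cdot \bar{c}^{\,m+1}_{i+1,j,k}
= -\xi\left(AAY \cdot \bar{c}^{\,m}_{i,j-1,k} + CCY \cdot \bar{c}^{\,m}_{i,j+1,k} + AAZ \cdot \bar{c}^{\,m}_{i,j,k-1} + CCZ \cdot \bar{c}^{\,m}_{i,j,k+1}\right)
- \frac{\mu_{i,j,k}}{\Delta t} \cdot \bar{c}^{\,n}_{i,j,k} - HIST - FORCING \tag{E.42}
$$

where superscript $m$ is an iteration index. Equation (E.42) is referred to as the successive
overrelaxation method. To implement SOR, the value obtained from equation (E.42) is relaxed according to

$$
\bar{c}^{\,m+1}_{i,j,k} = (1-\omega)\,\bar{c}^{\,m}_{i,j,k} + \omega\,\bar{c}^{\,m+1}_{i,j,k} \tag{E.43}
$$

in which $1 < \omega < 2$, and $\bar{c}^{\,m+1}_{i,j,k}$ on the right-hand side is the provisional solution of equation (E.42).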
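
The line-by-line solve and relaxation of (E.42) and (E.43) can be sketched on a small model problem. The example below is not the dissertation's solver: it assumes a two-dimensional five-point Laplace stencil with fixed boundary values in place of the full three-dimensional system, uses the most recently updated values on the neighboring lines, and solves each x-line with the Thomas algorithm. All function names are hypothetical.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal).

    a[0] and c[-1] are not used.
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def lsor_laplace(c, omega=1.3, tol=1e-10, max_iter=5000):
    """Line SOR along x for the 2-D discrete Laplace equation.

    c is a list of rows c[j][i]; entries on the outer edge are fixed
    boundary values, interior entries are updated in place.
    Returns the number of sweeps performed.
    """
    ny, nx = len(c), len(c[0])
    n = nx - 2
    for sweep in range(max_iter):
        change = 0.0
        for j in range(1, ny - 1):
            # Implicit in x: c[i-1] - 4 c[i] + c[i+1] = -(c[j-1] + c[j+1])
            sub = [1.0] * n
            main = [-4.0] * n
            sup = [1.0] * n
            rhs = [-(c[j - 1][i] + c[j + 1][i]) for i in range(1, nx - 1)]
            rhs[0] -= c[j][0]          # known boundary value at i = 0
            rhs[-1] -= c[j][nx - 1]    # known boundary value at i = nx - 1
            line = thomas(sub, main, sup, rhs)
            for i in range(1, nx - 1):
                # Relaxation step: blend old iterate with the line solution
                new = (1.0 - omega) * c[j][i] + omega * line[i - 1]
                change = max(change, abs(new - c[j][i]))
                c[j][i] = new
        if change < tol:
            return sweep + 1
    return max_iter
```

With boundary values that vary linearly in x, the discrete solution is the same linear profile, which gives a simple way to confirm that the sweeps converge.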